DIGITAL WATERMARKING

FIELD OF INVENTION 

The present invention relates to digital watermarking of
data including image, video and multimedia data.
Specifically, the invention relates to the insertion and
extraction of embedded signals for purposes of watermarking,
in which the insertion and extraction procedures are
repeatedly applied to subregions of the data. When these
subregions correspond to the 8x8 pixel blocks used for MPEG
and JPEG compression and decompression, the watermarking
procedure can be tightly coupled with these compression
algorithms to achieve very significant savings in
computation.

BACKGROUND OF THE INVENTION 

The proliferation of digitized media such as image, video 
and multimedia is creating a need for a security system 
which facilitates the identification of the source of the 
material. 

Content providers, i.e. owners of works in digital data
form, have a need to embed signals into video/image/
multimedia data which can subsequently be detected by
software and/or hardware devices for purposes of
authenticating copyright ownership, control and management.

For example, a coded signal might be inserted in data to 
indicate that the data should not be copied. The embedded 
signal should preserve the image fidelity, be robust to 
common signal transformations and resistant to tampering. 
In addition, consideration must be given to the data rate that 
can be provided by the system, though current requirements 
are relatively low — a few bits per frame. 

In U.S. patent application Ser. No. 08/534,894, filed Sep.
28, 1995, entitled "Secure Spread Spectrum Watermarking
for Multimedia Data", now abandoned and assigned to the
same assignee as the present invention, which is
incorporated herein by reference, there was proposed a spread
spectrum watermarking method which embedded a watermark
signal into perceptually significant regions of an image
for the purposes of identifying the content owner and/or
possessor. A strength of this approach is that the watermark
is very difficult to remove. In fact, this method only allows
the watermark to be read if the original image or data is
available for comparison. There are two reasons for this.
First, the original spectrum of the watermark is shaped to
that of the image through a non-linear multiplicative
procedure, and this spectral shaping must be removed prior to
detection by matched filtering. Second, the watermark is
inserted into the N largest spectral coefficients, the
ranking of which is not preserved after watermarking. Thus,
this method does not allow software and hardware devices to
directly read embedded signals.

In an article by Cox et al., entitled "Secured Spectrum
Watermarking for Multimedia", available at http://
www.Deci.nj.com/tr/index.html (Technical Report No.
95-10), spread spectrum watermarking is described which
embeds a pseudo-random noise sequence into the digital
data for watermarking purposes.

The above prior art watermark extraction methodology
requires that the original image spectrum be subtracted from
the watermarked image spectrum. This restricts the use of the
method when there is no original image or original image
spectrum available. One application where this presents a
significant difficulty is for third party device providers
desiring to read embedded information for enabling or
denying operation of such a device.

In U.S. Pat. No. 5,319,735 by R. D. Preuss et al., entitled
"Embedded Signalling", digital information is encoded to
produce a sequence of code symbols. The sequence of code
symbols is embedded in an audio signal by generating a
corresponding sequence of spread spectrum code signals
representing the sequence of code symbols. The frequency
components of the code signal are essentially confined to
a preselected signalling band lying within the bandwidth of
the audio signal, and successive segments of the code signal
correspond to successive code symbols in the sequence.
The audio signal is continuously frequency analyzed over a
frequency band encompassing the signalling band and the
code signal is dynamically filtered as a function of the
analysis to provide a modified code signal with frequency
component levels which are, at each time instant, essentially
a preselected proportion of the levels of the audio signal
frequency components in corresponding frequency ranges.
The modified code signal and the audio signal are combined
to provide a composite audio signal in which the digital
information is embedded. This composite audio signal is
then recorded on a recording medium or is otherwise
subjected to a transmission channel. Two key elements of this
process are the spectral shaping and spectral equalization
that occur at the insertion and extraction stages, respectively,
thereby allowing the embedded signal to be extracted
without access to the unwatermarked original data.

In U.S. patent application Ser. No. 08/708,331, filed Sep.
4, 1996, entitled "A Spread Spectrum Watermark for
Embedded Signaling" by Cox, now U.S. Pat. No. 5,848,155 and
incorporated herein by reference, there is described a
method for extracting a watermark of embedded data from
watermarked images or video without using an original or
unwatermarked version of the data. This work can be viewed
as an extension of the original work of Preuss et al. from the
audio domain to images and video.

This method of watermarking an image or image data for
embedded signaling requires that the DCT (discrete cosine
transform) of the entire image and its inverse be computed.
There are fast algorithms for computing the DCT in N log N
time, where N is the number of pixels in the image.
However, for N=512x512, the computational requirement is
still high, particularly if the encoding and extracting
processes must occur at video rates, i.e. 30 frames per second.
This method requires approximately 30 times the computation
needed for MPEG-II decompression.

One possible way to achieve real-time video watermarking
is to only watermark every Nth frame. However, content
owners wish to protect each and every video frame.
Moreover, if it is known which frames contain embedded
signals, it is simple to remove those frames with no
noticeable degradation in the video signal.

In U.S. patent application Ser. No. 08/715,953, filed Sep.
19, 1996, entitled "Watermarking of Image Data Using
MPEG/JPEG Coefficients" by Cox, and incorporated herein
by reference, there is described an alternative method, which
is to insert the watermark into nxn blocks of the image
(subimages) where $n \ll N$. Then the computation cost is

$$\frac{N}{n}\, n \log n = N \log n.$$

For $N = 512 \times 512 = 2^{18}$ and $n = 8 \times 8 = 2^{6}$, the
asymptotic saving is only a factor of 3. However, empirically
the cost of computing the DCT over the entire image may be
significantly higher when cache, loop unfolding and other
efficiency issues are considered. Thus, the practical difference
may approach a 30 fold savings. More importantly, if the
block size is chosen to be 8x8, i.e. the same size as that used
for MPEG image compression, then it is possible to tightly
couple the watermark insertion and extraction procedures to
those of the MPEG compression and decompression
algorithms. Considerable computational saving can then be
achieved since the most expensive computations relate to the
calculation of the DCT and its inverse and these steps are
already computed as part of the compression and
decompression algorithm. The incremental cost of watermarking
is then very small, typically less than 5% of the
computational requirements associated with MPEG.

The present invention improves the reliability of the
invention described in the 08/715,953 application, now
pending, by storing watermark information into subimages,
and extracting watermark information from subimages, in a
manner different from that described earlier.

SUMMARY OF THE INVENTION

The present invention improves the reliability of the prior
systems by systematically varying the order in which
watermark signal components are inserted into each subimage,
by inserting only part of the watermark signal into each
subimage, and, during watermark detection, by combining
the watermark signals found in groups of subimages to
reconstruct the original watermark signal before testing for
correlation with any predefined watermarks.

For detection, a reverse transformation is applied to each
subimage to reconstruct the watermark information that was
stored in that subimage. The resulting signals are then
averaged together to reconstruct the whole watermark, and
to reduce noise. Finally, this reconstructed watermark is
compared against a predefined set of watermark signals to
determine which one was inserted into the image.

A principal object of the present invention is, therefore, the
provision of inserting a subset of a watermark into a subset
of subregions of data to be watermarked.

Another object of the invention is the provision of a
digital watermarking system in which a watermark is
extracted by averaging the watermarked signal from
subregions of watermarked data, and then correlating the
resulting signal to determine the watermark.

A further object of the invention is the provision of a
digital watermarking system in which the watermark is
composed of two portions, a verification portion and a
synchronization portion, in order to improve watermark
extraction reliability.

Further and still other objects of the invention will
become more clearly apparent when the following
description is read in conjunction with the accompanying drawing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram of a watermark
insertion procedure;

FIG. 2 is a schematic block diagram of a watermark
insertion procedure in accordance with the teachings of the
present invention;

FIG. 3 is a schematic block diagram of a watermark
extraction procedure;

FIG. 4 is a schematic block diagram of a watermark
extraction procedure in accordance with the teachings of the
present invention;

FIG. 5 is a graphic representation of a zigzag pattern
useful for vectorizing subimages;

FIG. 6 is a graphic representation of rotation of PN
sequences;

FIG. 7 is a graphical representation of an 8x8 block
showing the spatial relation of averaged terms;

FIG. 8 is a schematic block diagram of a method for
inserting watermarks in accordance with the present
invention; and

FIG. 9 is a schematic block diagram of a method for
extracting watermarks in accordance with the present
invention.

DETAILED DESCRIPTION

Referring now to the figures, and FIGS. 1 through 4 in
particular, there are shown schematic block diagrams of a
[...] detecting watermarks in [...].

In the following description, reference may be made to
image data or images. While the invention has applicability
to image data and images, it will be understood that the
teachings herein and the invention itself are equally
applicable to video, image and multimedia data and the terms
"image" and "image data" will be understood to include
these terms where applicable. As used herein, "watermark"
will be understood to include embedded data, symbols,
images, instructions or any other identifying information.

In the following description, reference is made to
procedures described in U.S. patent application Ser. No.
08/534,894 for inserting and extracting or detecting a
watermark in images as INSERT-ORIGINAL and
EXTRACT-ORIGINAL, respectively. Reference is made to procedures
described in U.S. patent application Ser. No. 08/708,331,
filed Sep. 4, 1996, now U.S. Pat. No. 5,848,155, for inserting
and extracting or detecting watermarks in images as
INSERT-WHOLE and EXTRACT-WHOLE, respectively.
And reference is made to procedures described in U.S.
patent application Ser. No. 08/715,953 for inserting and
extracting or detecting watermarks in images as
INSERT-MPEG-A and EXTRACT-MPEG-A, respectively.

FIG. 1 shows a schematic block diagram of the
INSERT-WHOLE procedure for inserting watermarks into images.
The watermark signal, in the form of a finite sequence of
symbols from an alphabet, is provided as an input to
an error correction encoder 10 which transforms this
sequence into another sequence that contains redundant
information. The output of encoder 10 is provided to a
PN-mapper 11, which maps each symbol of the encoded
watermark into a pre-specified pseudo-random noise (PN)
code. The output of PN-mapper 11 is provided to a
spectral transformer 12, which converts the pseudo-random
noise sequence into the frequency domain. The conversion
preferably is by discrete cosine transform (DCT); however,
fast Fourier transform, wavelet type decomposition and the
like may also be used for frequency conversion.

Concurrently, the data to be watermarked is provided to
another spectral transformer 13. The outputs of the two
spectral transformers 12 and 13 are then provided as inputs
to a spectral shaper 14, which modifies the spectral
properties of the pseudo-random noise codes from spectral
transformer 12 [...] the watermark when added to the image
data. The spectrally transformed data to be watermarked,
from spectral transformer 13, is also provided as an input to
a delay 15. The output of the spectral shaper 14 is then added
to the output of delay 15 at a summer 16. The summer output
is subject to an inverse transform 17. The result of the
inverse transform is watermarked data.

INSERT-MPEG-A differs from INSERT-WHOLE by
segmenting the data to be watermarked into multiple blocks,
such as 8x8 pixel subimages or subregions. Each block of
data then has the watermark inserted according to the above
described method. That is, for each 8x8 subimage or
subregion, a pseudo-random number (PN) sequence is
inserted into the DCT coefficients after suitable spectral
shaping. The procedure is repeated for all such subimages or
subregions. The size of the subimage or subregion is
preferably 8x8, but it can be of other sizes, such as 2x2, 3x3, 4x4
or 16x16.

FIG. 2 shows a schematic block diagram of a watermark
insertion procedure in accordance with teachings of the
present invention. The watermark signal is processed into a
noise spectrum signal by the error correction encoder 20, the
PN mapper 21, and the spectral transformer 22, in the same
manner as in INSERT-WHOLE. However,
unlike INSERT-WHOLE or INSERT-MPEG-A, the
watermark is then used as an input to a watermark segmenter 23
which systematically separates the watermark into several
subwatermarks. Any portion of the original watermark
might appear redundantly in several of the resulting
subwatermarks. Concurrently, the data to be watermarked is provided
as an input to data segmenter 24, which segments the data
into blocks or subregions, such as 8x8 subimages, as in
INSERT-MPEG-A. Each of the subwatermarks output by
the watermark segmenter 23 is then inserted into a data
block by one of the watermark inserters 25a, 25b, etc. The
procedure used by the watermark inserters 25a, 25b, etc., is
the same procedure described in connection with watermark
inserter 18 in FIG. 1. That is, each subwatermark is added
into a spectrally transformed data block after spectral
shaping, and the resulting data is then transformed back into
the spatial domain. Finally, the watermarked data blocks
from the watermark inserters 25a, 25b, etc., are assembled by
data combiner 26 to produce watermarked data.

FIG. 3 is a schematic block diagram of the
EXTRACT-WHOLE procedure. The watermarked image,
video or multimedia data is first used as input into a spectral
normalizer 30 to undo any previously performed spectral
shaping. If the data contains a watermark, then the output of
the spectral normalizer 30 will resemble the spectral
transformation of the PN coding of that watermark (the signal
that was input to the spectral shaper 14 in FIG. 1). The
output of the spectral normalizer 30 is then used as an input
to several correlators 31a, 31b, etc., which test the
watermark against the PN codes used to represent the various
symbols that the encoded watermark might contain (i.e. each
correlator tests for one PN code that is used to encode a
symbol by the PN mapper 11 of FIG. 1). The outputs of the
correlators 31a, 31b, etc., are used as inputs to a decision
circuit 32, which determines the most likely sequence of
symbols. Finally, this sequence is corrected by an error
corrector 33, which performs the inverse of the coding that was
performed by the error correction encoder 10 in FIG. 1. The
result is the extracted watermark.

In EXTRACT-MPEG-A, the data from which a
watermark is to be extracted is first segmented into several blocks,
such as 8x8 subimages, exactly as in INSERT-MPEG-A.
The signal from each subimage is then normalized and used
as input into a bank of correlators similar to the correlators
31a, 31b, etc. in FIG. 3. The output from the correlators is
then averaged with the outputs of corresponding correlators
from other subimages, and the resulting average correlations
are used as inputs into the decision circuit 32 for subsequent
processing as described above.

FIG. 4 shows a schematic block diagram of a watermark
extraction procedure in accordance with the teachings of the
present invention. The watermarked data is first segmented
into blocks by data segmenter 40, which corresponds to the
data segmenter 24 used during the insertion procedure in
FIG. 2. Each of the data blocks is provided to a respective
spectrum normalizer 41a, 41b, etc. to produce a signal
resembling the subwatermark that was inserted into the
respective data block. These inserted subwatermark signals
are then used as inputs into a watermark combiner 42. In the
combiner 42, parts of the watermark that appear redundantly
in several subwatermarks are averaged together to reduce
noise. The output of the watermark combiner 42 is provided
as an input to a symbol separator 43 which divides the
watermark into parts, each of which corresponds to one
symbol from the encoded watermark signal (the output of
error correction encoder 20 in FIG. 2).

These symbols from separator 43 are provided as inputs
to watermark identifiers [...], [...] correlators and a decision
circuit [...]. The outputs of the watermark
identifiers are symbols from the alphabet used in the original
encoded watermark signal. The identified symbols are
reassembled into an encoded watermark by the symbol
combiner 45. Finally, the resulting encoded watermark is
passed to an error corrector 46.

The insertion and extraction procedures will now be
described in more detail. In INSERT-ORIGINAL and
EXTRACT-ORIGINAL the object is to embed a single PN
(pseudo random number) sequence into an image when the
original image is available at the time of extraction. The
information associated with the PN sequence is assumed to
be stored in a database together with the original image and
the spectral location of the embedded watermark. The
location of the watermarked components had to be recorded
because the implementation approximated the N
perceptually most significant regions of the watermark by the N
largest coefficients. However, this ranking was not invariant
to the watermarking process. The N largest coefficients may
be different after inserting the watermark than before
inserting the watermark.

To avoid this problem, the present invention
places the watermark in predetermined locations of the
spectrum, typically the first N coefficients. However, other
predetermined locations could be used, though such
locations should be from the perceptually significant regions of
the spectrum if the watermark is to survive common signal
transformations such as compression, scaling, etc.

More generally, the information to be embedded is a
sequence of m symbols drawn from an alphabet A (e.g. the
binary digits or the ASCII symbols). This data is then
supplemented with additional symbols for error detection
and correction. Each symbol is then spread spectrum
modulated, a process that maps each symbol into a unique
PN sequence known as a chip. The number of bits per chip
is preset: the longer the chip length, the higher the detected
signal-to-noise ratio will be, but this is at the expense of
signaling bandwidth.

The power spectrum of the PN sequence is white, i.e. flat,
and is therefore shaped to match that of the "noise", i.e. the
image/video/audio/or multimedia data into which the
watermark is to be embedded. It is this spectral shaping that must
be modified from the prior methods so that the extraction
process no longer requires the original image. To do this,
each coefficient of the watermark spectrum is scaled by
the local average of the power in the image spectral
coefficients rather than the coefficient itself, i.e.

$$f_i' = f_i + \alpha \, \mathrm{avg}(|f_i|) \, W_i \tag{1}$$
The averaging is the averaging of the absolute coefficient
values and not the coefficient values themselves. This is
effectively estimating the average power present at each
frequency. Other averaging procedures are possible, for
example, averaging over several frames or averaging of local
neighborhoods of 8x8 blocks.

This average may be obtained in several ways. It may be
a local average over a two dimensional region. Alternatively,
the two dimensional spectrum may be sampled to form a one
dimensional vector and a one dimensional local average may
be performed. One dimensional vectorization of the two
dimensional 8x8 DCT coefficients is already performed as
part of MPEG II. The average may be a simple box or
weighted average over the neighborhood.

For video data, temporal averaging of the spectral
coefficients over several frames can also be applied. However,
since several frames are needed for averaging at the spectral
normalization stage of the extractor, the protection of
individual video frames taken in isolation may not be possible.
For this reason, the present invention treats video as a very
large collection of still images. In this way, even individual
video frames are copy protected.

In order to extract the watermark, it is necessary to
perform the spectral normalization, in which the previously
performed spectral shaping procedure is inverted. In the
present invention, the original unwatermarked signal is not
available. Thus, the average power of the frequency
coefficients, avg(|f_i|), is approximated by the average of the
watermarked signal, i.e. avg(|f_i'|):

$$\mathrm{avg}(|f_i'|) \approx \mathrm{avg}(|f_i|) \tag{2}$$

This is approximately true since the inserted term,
α avg(|f_i|) W_i, is small, where W_i is the watermark
component and α is a constant typically in the range
between 0.1 and 0.01.

The normalization stage then divides each coefficient (f_i')
in the received signal by the local average avg(|f_i'|) in the
neighborhood. That is

$$\frac{f_i'}{\mathrm{avg}(|f_i'|)} \approx \frac{f_i + \alpha \, \mathrm{avg}(|f_i|) \, W_i}{\mathrm{avg}(|f_i|)} = \frac{f_i}{\mathrm{avg}(|f_i|)} + \alpha W_i \tag{3}$$

The first term on the right hand side (RHS) of Equation
(3), f_i/avg(|f_i|), is considered a noise term. This term was
not present in the system described in U.S. patent application
Ser. No. 08/534,894, because access to the unwatermarked
coefficients allowed this term to be removed. The second
term αW_i is the original watermark signal which can now be
detected using conventional correlation.

If the watermark is extracted from any single 8x8 block,
the detector reliability is very low. If, however, the
watermarks extracted from each 8x8 block are first added together
and the averaged watermark is then applied to the correlator,
then a very strong and unambiguous response is obtained.
This differs from the method described in U.S. patent
application Ser. No. 08/715,953 in which correlation
occurred within each block and the outputs from the
correlators were averaged together. The present invention was
found to improve the detection response and significantly
reduce the computation requirement associated with each
block.

In practicing the present invention preferably there is a
unique PN sequence for each symbol in the alphabet. The
method is relatively robust to clipping since the detector
output reduces linearly with the quantity of 8x8 subimage
blocks in the image. For DVD (digital video disk) embedded
signaling for APS (analog protection system) and CGMS
(copy generation management system), there would be a
total of 8 or 16 PN sequences.

The number of 8x8 blocks in a 512x512 image is 4096,
suggesting that significantly more than one of 16 symbols
can be embedded in an image or video frame. Assume, for
example, that it is desired to embed 1 out of 128 symbols in
an image. It is necessary to perform 128 parallel
correlations. This is computationally tractable but hardware
implementations of each correlation become more complex. An
alternative method is to only use two binary symbols. It may
be preferable to associate more than one PN sequence with
each of the two binary symbols or bits in order to increase
the difficulty of intentionally removing the watermark. In
this case, there are only two correlators and a binary string
may be embedded into the image. The raw bit error rate will
be very high due to the low detector output. However, this
can be reduced to acceptable levels by using error correcting
codes, such as Reed-Solomon (RS). RS codes are robust to
burst errors which may occur because of clipping of the
image. Other error correcting codes may also be used.

When using this method, it is necessary for the receiver to
know the start location of the encoded block. The start
location may not be obvious, particularly when the image
is subjected to clipping. However, conventional
synchronizing methods can be used, such as preceding each
block with a special or unique symbol or string of symbols.

To insert a watermark, each 8x8 block is treated as an
individual subimage or subregion. The DCT of the subimage
is computed and the two dimensional DCT is vectorized
in the zigzag pattern shown in FIG. 5, although other
patterns are also possible. These two stages constitute most
of the calculations but are part of the MPEG encoding
process. Next, a PN noise sequence {w_1 . . . w_n} is inserted
into the DCT coefficients using Equation 1 as before. The
length of the PN sequence cannot exceed 64 (in an 8x8
block) and is typically much shorter, in the range of 11 to 25.

If only a single code is to be inserted into the image,
then the same PN sequence is inserted into each of the
720x480/64 = 5400 blocks. However, a variation may be
performed at this point in the procedure. Within each row of
blocks, the PN sequence is cyclically rotated by one
frequency coefficient prior to insertion in the subsequent block.
Similarly, the PN sequence is cyclically rotated by one
frequency coefficient at the start of each new row. FIG. 6
illustrates an order of rotations.

The purpose of these rotations or shifts is to improve the
response of the watermark extraction stage. Earlier
experiments revealed that certain DCT coefficients were more
difficult to estimate than others. The location of these
coefficients varied from image to image. However, within an
image, the coefficient could be consistently poor.
Consequently, without shifting, one or more of the estimated
watermark coefficients could be significantly degraded
relative to the other watermark coefficients, thereby reducing the
detector performance. Conversely, shifting significantly
reduces the effect a poor DCT coefficient has on a single
watermark coefficient and the detector performance is
markedly improved. Note that any cyclic pattern can be used.
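
The benefit of the rotation can be illustrated with a small numerical sketch. It is not taken from the specification: the rotation schedule (shift equal to row plus column, modulo the sequence length), the 4x11 grid of blocks, the 11-element watermark and the size of the estimation error are all assumptions chosen for illustration, and spectral shaping and normalization are omitted. The sketch models one DCT position that is mis-estimated in every block, then compares the averaged watermark with and without rotation.

```python
def rotate(seq, k):
    """Return seq cyclically rotated left by k positions (negative k rotates right)."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def averaged_error(watermark, rows, cols, bad_pos, bad_err, shifting):
    """Average the watermark recovered from rows x cols blocks and return the
    absolute error of each averaged watermark element.

    Every block mis-estimates the coefficient at position bad_pos by bad_err,
    modelling a DCT coefficient that is consistently poor within one image.
    """
    n = len(watermark)
    totals = [0.0] * n
    for r in range(rows):
        for c in range(cols):
            # Assumed schedule: one step per block along a row, one per new row.
            shift = (r + c) % n if shifting else 0
            block = rotate(watermark, shift)   # sequence as stored in this block
            block[bad_pos] += bad_err          # poor estimate at one DCT position
            restored = rotate(block, -shift)   # extractor undoes the rotation
            for i, value in enumerate(restored):
                totals[i] += value
    count = rows * cols
    return [abs(totals[i] / count - watermark[i]) for i in range(n)]


watermark = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0]
unshifted = averaged_error(watermark, 4, 11, bad_pos=0, bad_err=5.0, shifting=False)
shifted = averaged_error(watermark, 4, 11, bad_pos=0, bad_err=5.0, shifting=True)
print(max(unshifted), max(shifted))
```

Under these assumptions the worst-case error of an averaged watermark element falls from 5.0 without rotation to about 0.45 with rotation, because the error is spread evenly over all eleven elements instead of landing on one.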
Further modifications are useful once rotation of the
watermark has been introduced. First, the length of the
watermark may now be significantly greater than 64. Then,
for each block only a small subset (say five) of the watermark
coefficients is inserted into the first five DCT
coefficients (excluding the d.c. term). Because of the rotation, a
different subset of the watermark is inserted into
neighboring 8x8 blocks. Finally, having completed the watermark
insertion, the MPEG encoder is able to proceed with the
subsequent stages of compression.
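
As a rough illustration of how a long watermark can be distributed five coefficients at a time, the following sketch counts how often each watermark element is stored. The watermark length of 100 and the schedule in which the rotation advances by one element per block are assumptions made for the example; the block count of 5400 is the 720x480 figure quoted in the text.

```python
def subwatermark_indices(length, shift, subset_size=5):
    """Indices of the watermark elements stored in a block whose rotation is shift."""
    return [(shift + j) % length for j in range(subset_size)]


def coverage(length, num_blocks, subset_size=5):
    """Count how many blocks carry each watermark element."""
    counts = [0] * length
    for block in range(num_blocks):
        # Assumed schedule: the rotation advances by one element per block.
        for index in subwatermark_indices(length, block % length, subset_size):
            counts[index] += 1
    return counts


counts = coverage(length=100, num_blocks=5400)
print(subwatermark_indices(100, 98), min(counts), max(counts))
```

With these numbers every element of the watermark is stored in 270 different blocks, which is what allows the detector to average many noisy estimates of each element.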

Note that the watermark may also be inserted after the
MPEG quantization stage to reduce distortion of the
watermark. MPEG-2 performs a convenient one dimensional
vectorization called "zigzagging", which allows a simple 3x1
box average to be performed on the coefficients (excluding
the d.c. term).

In practice, performance improved when the averaging was
performed using the two four-connected coefficients closest to
the d.c. term, as illustrated in FIG. 7, i.e. the two coefficients
above and to the left.
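
A minimal sketch of the simpler 3x1 box average over a zigzag-ordered coefficient vector, and of scaling the watermark by that average in the manner of Equation 1, might look as follows. The function names, the truncation of the window at the ends of the vector and the value of alpha are assumptions for illustration; the FIG. 7 variant, which averages the two four-connected neighbors instead, is not shown.

```python
def box_average_abs(coeffs, width=3):
    """Local average of absolute values over a zigzag-ordered vector of AC
    coefficients (d.c. term already excluded). The window is truncated at the
    ends of the vector, which is an assumption of this sketch."""
    half = width // 2
    averages = []
    for i in range(len(coeffs)):
        window = coeffs[max(0, i - half): i + half + 1]
        averages.append(sum(abs(v) for v in window) / len(window))
    return averages


def insert_watermark(coeffs, watermark, alpha=0.1):
    """Add each watermark element to its coefficient, scaled by alpha times the
    local average magnitude rather than by the coefficient itself."""
    averages = box_average_abs(coeffs)
    marked = list(coeffs)
    for i, w in enumerate(watermark):
        marked[i] = coeffs[i] + alpha * averages[i] * w
    return marked


print(box_average_abs([3.0, -6.0, 9.0, -12.0]))
print(insert_watermark([10.0, -8.0], [1.0, -1.0], alpha=0.5))
```

Averaging magnitudes rather than signed values matters here: signed coefficients of alternating sign would largely cancel, whereas the magnitudes give an estimate of the local power that changes little when a small watermark is added.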

Watermark detection begins by first extracting the PN
noise sequence from each 8x8 block using Equation 1. For
each block, the PN sequence is then cyclically shifted in the
opposite direction by one frequency coefficient, and the
average over all the blocks is then computed. In practice, this
process can be computed incrementally and does not require
temporary storage of all the extracted watermarks. A
weighted averaging can also be applied, where the weights
are determined based on their susceptibility to common
signal transformations such as low pass filtering. Finally, the
average watermark is compared with the original PN
sequence via correlation.

The reason for shifting the watermark in the column
direction may now be apparent. If the
image is clipped on an arbitrary block boundary, then the
computed average watermark will simply be rotated by an
amount that is a function of the relative location of the
clipped portion of the image. Correlation can then be
performed on all permutations (typically 11 to 25) of the
watermark. The output from the correlator with the
maximum value is then used for decision purposes. The
extraction stage is depicted in FIG. 4.

Taking the maximum
correlator output over all rotations of the watermark can
cause the decision circuitry to be noisy. To improve this, the
watermark is broken into two pieces: a synchronization
portion of length K and a verification portion of length N-K.
Then, when the watermark is extracted as before, correlation
is first performed only on all rotations of the synchronization
portion of this watermark. The maximum correlation output
is noted, then the verification portion of the watermark is
rotated by the corresponding amount and a second
correlation is performed on the verification portions of the
watermarks. This process significantly improves the overall
reliability of the system.

In the course of experimentation, it was
noticed that some watermarks performed better than others
on the same imagery. This was caused by variation in the
correlation statistics between the synchronization and
verification portions of the watermark. Ideally, the two portions
should have very low correlations. However, in several cases
where watermarks performed poorly, the cause was traced to
unexpected correlations between the two portions.
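
The two-stage test can be sketched as below. This is only an illustration under stated assumptions: the synchronization sequence (a 7-element Barker code), the verification sequence, the normalized dot product used as the correlator and the function names are all chosen for the example, and the input is a noise-free averaged watermark that has merely been rotated, as it would be after clipping on a block boundary.

```python
def rotate(seq, k):
    """Return seq cyclically rotated left by k positions (negative k rotates right)."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def correlate(a, b):
    """Normalized dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def detect(extracted, sync, verify):
    """Find the rotation that best aligns the synchronization portion, then
    correlate only the verification portion at that rotation."""
    k = len(sync)
    best_rotation, best_score = 0, float("-inf")
    for r in range(len(extracted)):
        score = correlate(rotate(extracted, r)[:k], sync)
        if score > best_score:
            best_rotation, best_score = r, score
    aligned = rotate(extracted, best_rotation)
    return best_rotation, correlate(aligned[k:], verify)


sync = [1, 1, 1, -1, -1, 1, -1]      # assumed synchronization portion, K = 7
verify = [1, -1, -1, 1, 1, -1]       # assumed verification portion, N - K = 6
clipped = rotate(sync + verify, -4)  # averaged watermark as seen after clipping
print(detect(clipped, sync, verify))
```

Because the decision is made on the verification portion alone, the maximum taken over rotations no longer feeds directly into the decision. This only works well when the two portions are weakly correlated with each other, which is the condition the text identifies.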

The present invention provides a modification to digital
watermarking methods in which the original data is required
for watermark extraction, thereby enabling watermark
extraction in the absence of unwatermarked or original
data. The present invention preferably uses MPEG/JPEG
coefficients. An image is divided into subimages or
subregions, typically 8x8 blocks, and each subimage is
processed and the results are combined to derive the
extracted watermark. The result is extraction of the
watermark with very high confidence.
While the above invention describes improvements to the
prior-art INSERT-WHOLE, INSERT-MPEG-A,
EXTRACT-WHOLE, and EXTRACT-MPEG-A algorithms,
it should be apparent to anyone skilled in the art that the
same improvements may be applied to any algorithm for
inserting and extracting watermarks in image data. This
more general view of the present invention is shown in
FIGS. 8 and 9.

FIG. 8 shows a schematic block diagram of the general
method for inserting watermarks. This general method
makes use of a non-block-based watermark insertion
algorithm, which shall be referred to hereafter as the "base
insertion algorithm". The watermark encoder 80 converts
the watermark into a form appropriate for the base insertion
algorithm. If the base insertion algorithm is that shown in
FIG. 1, for example, then the watermark encoder 80
corresponds to the watermark encoder 19, which comprises the
error correction encoder 10, the PN mapper 11, and the
spectral transformer 12. However, if a different base
insertion algorithm is to be used, then the watermark encoder 80
may perform a different transformation of the watermark.
The encoded watermark signal from watermark encoder 80
is provided as an input to watermark segmenter 81, which
divides the watermark into a set of subwatermarks. Any
portion of the original watermark might appear redundantly
in several of the resulting subwatermarks. The data to be
watermarked is provided as an input to data segmenter 82,
which divides the data into subregions. Each subwatermark
is inserted into a respective data subregion by a watermark
inserter 83a, 83b, etc. The watermark inserters implement
the base insertion algorithm, so, if the base insertion
algorithm is that shown in FIG. 1, then each watermark inserter
83a, 83b, etc., corresponds to the watermark inserter 18,
which comprises a spectral transformer 13, a spectral shaper
14, a delay 15, a summer 16, and an inverse transform 17.
However, if a different base insertion algorithm is to be used,
then the watermark inserters 83a, 83b, etc., may employ a
different method of inserting subwatermarks into the
subregions of the data to be watermarked. The outputs from the
watermark inserters are assembled in data combiner 84 to
provide watermarked data.
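
The data flow of FIG. 8 can be illustrated with a minimal
sketch. It rests on several assumptions that are not in the
description: the data is a flat list of numbers rather than an
image, the watermark encoder is omitted, the watermark
segmenter simply cycles through the watermark, and the base
insertion algorithm is replaced by plain addition. All function
names are hypothetical.

```python
def segment_data(data, size):
    """Data segmenter: split data into consecutive subregions."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def segment_watermark(watermark, n_sub, size):
    """Watermark segmenter: one subwatermark per subregion.

    The watermark is read cyclically, so portions of it appear
    redundantly in several subwatermarks when it is shorter than
    the data (an assumed segmentation rule).
    """
    n = len(watermark)
    return [[watermark[(b * size + j) % n] for j in range(size)]
            for b in range(n_sub)]


def base_insert(subregion, subwatermark, alpha=1):
    """Stand-in base insertion algorithm: scaled addition."""
    return [x + alpha * w for x, w in zip(subregion, subwatermark)]


def insert_watermark(data, watermark, size, alpha=1):
    """Segment, insert per subregion, then recombine."""
    subregions = segment_data(data, size)
    subwatermarks = segment_watermark(watermark, len(subregions), size)
    marked = [base_insert(r, w, alpha)
              for r, w in zip(subregions, subwatermarks)]
    # Data combiner: concatenate the watermarked subregions.
    return [x for region in marked for x in region]


marked = insert_watermark([10] * 8, [1, -1, 1], size=4)
print(marked)
```

Swapping base_insert for a spectral method such as that of
FIG. 1 would leave the segmenting and combining steps unchanged,
which is the point of the general view.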

FIG. 9 shows a schematic block diagram of the corresponding
general extraction algorithm. The algorithm makes
use of a "base extraction" algorithm that corresponds to the
base insertion algorithm used in inserting the watermark into
the data to be watermarked (FIG. 8). The algorithm in FIG.
9 is substantially the same as the algorithm shown in FIG.
4, except that, in the general case, the spectrum normalizers
41a, etc. are replaced by watermark extractors 91a, etc.,
which implement the base extraction algorithm. That is, if
the base insertion algorithm used was the algorithm shown
in FIG. 1, then the watermark extractors 91a, etc., in FIG. 9
will be the spectrum normalizers 41a, etc. in FIG. 4.

While there has been described and illustrated a system
for inserting a watermark into and extracting a watermark
from watermarked data without using an unwatermarked
version of the data, it will be apparent to those skilled in the
art that variations and modifications are possible without
deviating from the broad principles and teachings of the
present invention which shall be limited solely by the scope
of the claims appended hereto.

What is claimed is:

1. A method for inserting a watermark signal into data to
be watermarked comprising the steps of:

dividing data to be watermarked into a plurality of
subregions;

computing frequency coefficients of the data to be
watermarked in each subregion;

spread spectrum modulating a watermark signal to be
inserted by mapping the watermark signal into a PN
(pseudo-random noise) sequence;

spectral shaping the PN sequence as a function of the
average power in each frequency coefficient of the data;
and

inserting each spectral shaped PN sequence into
predetermined coefficients in the data in each subregion.

2. A method for inserting a watermark signal into data to
be watermarked as set forth in claim 1, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

3. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where said frequency 
coefficients are DCT (discrete cosine transform) coefficients. 

4. A method for inserting a watermark signal into data to
be watermarked as set forth in claim 3, where each subre- 
gion is a 8x8 block of pixels. 

5. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 4, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

6. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where each subregion
is a 8x8 block of pixels.

7. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 6, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

8. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 6, where the frequency
coefficients of the watermark signal are rotated prior to 
inserting of each spectral shaped PN sequence into the 
subregion. 

9. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim [number illegible], where said inserting
is performed after the data undergoes MPEG quantization 
processing. 

10. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 8, where only a subset 
of the watermark signal frequency coefficients is inserted 
into any one subregion. 

11. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 10, where the water- 
mark signal comprises a synchronization portion and a 
verification portion. 

12. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 11, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

13. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 11, where the synchronization
portion and the verification portion have very little
correlation between each other. 

14. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where the spectral
shaping as a function of the average power is typically 3x1 
window of the coefficient obtained from the one- 
dimensional vectorization by zigzagging of two-dimension 
frequency coefficients. 

15. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where the spectral 
shaping is a function of the average power based on the two 
four-connected frequency coefficients closest to the DC 
term. 



16. A method of extracting a watermark from watermarked
data comprising the steps of:

receiving subregions of watermarked data;

spectrum normalizing the watermarked data as a function
of the average power in each frequency coefficient of
the watermarked data in each subregion to generate
respective normalized signals;

combining the respective normalized signals from each
subregion to generate a single watermark;

correlating the single watermark with predetermined PN
(pseudo-random noise) sequences corresponding to
predetermined symbols to provide correlated signals
for each predetermined PN sequence in each subregion;

deciding which correlated signal is most likely a current
symbol; and

extracting a sequence of most likely current symbols
corresponding to the watermark.

17. A method of extracting a watermark from water- 
marked data as set forth in claim 16, where the subregions 
are 8x8 blocks used for MPEG encoding and decoding. 

18. A method of extracting a watermark from water- 
marked data as set forth in claim 17, where said combining 
the normalized signals from each subregion to generate a 
single watermark, including removing the relative rotation 
of the watermark between blocks. 

19. A method of extracting a watermark from water- 
marked data as set forth in claim 18, further comprising 
subsequently reconstructing the watermark from partial 
watermarks inserted into each block. 

20. A method of extracting a watermark from watermarked
data as set forth in claim 19, further comprising
weighting the watermark coefficients based on their location 
within the frequency spectrum, where the weighting is a 
function of the susceptibility of each frequency coefficient to 
common signal transformations. 

21. A method of extracting a watermark from water- 
marked data as set forth in claim 16, further comprising 
correlating with all rotational shifts of the extracted water- 
mark and selecting the maximum value. 

22. A method of extracting a watermark from watermarked
data as set forth in claim 16, further comprising
correlating with all rotational shifts of a synchronization 
portion of a watermark to determine a maximum value and 
subsequently rotating a verification portion of the watermark 
by the same amount as the synchronization portion is rotated 
to obtain the maximum value prior to correlating between 
the verification portion and predetermined PN sequences. 

23. A method of extracting a watermark from watermarked
data comprising the steps of:

receiving subregions of watermarked data; 
spectrum normalizing the watermarked data as a function 
of the average power in each frequency coefficient of 
the watermarked data in each subregion to generate 
respective normalized signals; 
correlating the respective normalized signals with prede- 
termined PN sequences corresponding to predeter- 
mined symbols to provide correlated signals for each 
predetermined PN sequence in each subregion; 
deciding which correlated signal is most likely a current 
symbol in each subregion for providing an extracted
symbol stream; 
error correcting the extracted symbol stream; and 
extracting a sequence of most likely current symbols 
corresponding to the watermark.

24. A method of extracting a watermark from watermarked
data as set forth in claim 23, where said error
correction is Reed Solomon error correction.



25. A method for inserting a watermark signal into data to
be watermarked comprising the steps of:

dividing data to be watermarked into a plurality of sub- 
regions; 

dividing a watermark signal into a plurality of subwater- 
marks where portions of the watermark are contained in 
more than one subwatermark; and 

inserting said plurality of subwatermarks into said plu- 
rality of subregions. 

26. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 25, where each sub- 
watermark is inserted into a respective subregion, so that 
each sub region contains at least one subwatermark. 

27. A method for extracting a watermark signal from 
watermarked data comprising the steps of: 

receiving a plurality of subregions of watermark data; 
extracting a subwatermark from each subregion of said 
plurality of subregions; and

combining and averaging the subwatermarks extracted
from all the subregions to obtain a signal commensurate
with the watermark signal.

28. A method for extracting a watermark signal from
watermarked data as set forth in claim 27, further compris- 
ing the steps of: 

dividing the signal commensurate with the watermark 
signal into a plurality of symbol signals; 

correlating each symbol signal with a set of predefined
signals; 

determining which predefined signal best corresponds to 
each symbol signal; and 

combining the best corresponding predetermined signals
to generate the watermark signal. 

* * * * *

[57] ABSTRACT

Digital watermarking of audio, image, video or multimedia 
data is achieved by inserting the watermark into the percep- 
tually significant components of a decomposition of the data 
in a manner so as to be visually imperceptible. In a preferred
method, a frequency spectral image of the data, preferably a 
Fourier transform of the data, is obtained. A watermark is 
inserted into perceptually significant components of the 
frequency spectral image. The resultant watermarked spectral
image is subjected to an inverse transform to produce
watermarked data. The watermark is extracted from watermarked
data by first comparing the watermarked data with
the original data to obtain an extracted watermark. Then, the
original watermark, original data and the extracted watermark
are compared to generate a watermark which is analyzed
for authenticity of the watermark.

SECURE SPREAD SPECTRUM
WATERMARKING FOR MULTIMEDIA DATA

This application is a continuation of application Ser. No.
08/534,894, filed Sep. 28, 1995, now abandoned. 

FIELD OF THE INVENTION

The present invention concerns a method of digital
watermarking for use in audio, image, video and multimedia data
for the purpose of authenticating copyright ownership,
identifying copyright infringers or transmitting a hidden
message. Specifically, a watermark is inserted into the
perceptually most significant components of a decomposition of
the data in a manner so as to be virtually imperceptible.
More specifically, a narrow band signal representing the
watermark is placed in a wideband channel that is the data.

BACKGROUND OF THE INVENTION 

The proliferation of digitized media such as audio, image 
and video is creating a need for a security system which 
facilitates the identification of the source of the material. The 
need manifests itself in terms of copyright enforcement and 
identification of the source of the material. 

Using conventional cryptographic systems permits only
valid keyholder access to encrypted data, but once the data
is encrypted, it is not possible to maintain records of its
subsequent representation or transmission. Conventional
cryptography therefore provides minimal protection against
data piracy of the type a publisher or owner of data or
material is confronted with by unauthorized reproduction or
distribution of such data or material.

A digital watermark is intended to complement crypto- 
graphic processes. The watermark is a visible or preferably 
an invisible identification code that is permanently embed- 
ded in the data. That is, the watermark remains with the data 
after any decryption process. As used herein the terms data 
and material will be understood to refer to audio (speech and 
music), images (photographs and graphics), video (movies 
or sequences of images) and multimedia data (combinations 
of the above categories of materials) or processed or com- 
pressed versions thereof. These terms are not intended to 
refer to ASCII representations of text, but do refer to text 
represented as an image. A simple example of a watermark 
is a visible "seal" placed over an image to identify the 
copyright owner. However, the watermark might also con- 
tain additional information, including the identity of the 
purchaser of the particular copy of the image. An effective 
watermark should possess the following properties: 

1. The watermark should be perceptually invisible or its 
presence should not interfere with the material being pro- 
tected. 

2. The watermark must be difficult (preferably virtually 
impossible) to remove from the material without rendering 
the material useless for its intended purpose. However, if 
only partial knowledge is known, e.g. the exact location of 
the watermark within an image is unknown, then attempts to 
remove or destroy the watermark, for instance by adding 
noise, should result in severe degradation in data fidelity, 
rendering the data useless, before the watermark is removed 
or lost. 

3. The watermark should be robust against collusion by 
multiple individuals who each possess a watermarked copy 
of the data. That is, the watermark should be robust to the 
combining of copies of the same data set to destroy the 
watermarks. Also, it must not be possible for colluders to
combine each of their images to generate a different valid
watermark.

4. The watermark should still be retrievable if common
signal processing operations are applied to the data. These
operations include, but are not limited to, digital-to-analog
and analog-to-digital conversion, resampling, requantization
(including dithering and recompression) and common signal
enhancements to image contrast and color, or audio bass and
treble for example. The watermarks in image and video data
should be immune from geometric image operations such as
rotation, translation, cropping and scaling.

5. The same digital watermark method or algorithm
should be applicable to each of the different media under
consideration. This is particularly useful in watermarking of
multimedia material. Moreover, this feature is conducive to
the implementation of video and image/video watermarking
using common hardware.

6. Retrieval of the watermark should unambiguously
identify the owner. Moreover, the accuracy of the owner
identification should degrade gracefully during attack.

Several previous digital watermarking methods have been
proposed. L. F. Turner in patent number WO89/08915 entitled
"Digital Data Security System" proposed a method for
inserting an identification string into a digital audio signal by
substituting the "insignificant" bits of randomly selected
audio samples with the bits of an identification code. Bits are
deemed "insignificant" if their alteration is inaudible. Such
a system is also appropriate for two dimensional data such
as images, as discussed in an article by R. G. Van Schyndel
et al entitled "A digital watermark" in Intl. Conf. on Image
Processing, vol 2, Pages 86-90, 1994. The Turner method
may easily be circumvented. For example, if it is known that
the algorithm only affects the least significant two bits of a
word, then it is possible to randomly flip all such bits,
thereby destroying any existing identification code.
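
The weakness of least-significant-bit substitution can be shown
with a small sketch. It is not Turner's actual method: it
assumes integer samples, uses only the single least significant
bit, and picks sample positions with a seeded generator standing
in for a secret key. The function names are hypothetical.

```python
import random


def embed_lsb(samples, code_bits, key):
    """Write code bits into the LSB of key-selected samples."""
    positions = random.Random(key).sample(range(len(samples)),
                                          len(code_bits))
    out = list(samples)
    for pos, bit in zip(positions, code_bits):
        out[pos] = (out[pos] & ~1) | bit
    return out


def extract_lsb(samples, n_bits, key):
    """Read the LSBs back from the same key-selected positions."""
    positions = random.Random(key).sample(range(len(samples)), n_bits)
    return [samples[p] & 1 for p in positions]


def randomize_lsb(samples, seed):
    """Attack: overwrite every LSB with a random bit, no key needed."""
    rng = random.Random(seed)
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


samples = list(range(100, 200))
code = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(samples, code, key=42)
attacked = randomize_lsb(marked, seed=7)

print(extract_lsb(marked, len(code), key=42) == code)
print(max(abs(a - m) for a, m in zip(attacked, marked)))
```

The attacker never learns which samples carry the code, yet
changes no sample by more than one unit, so the data stays
usable while each code bit survives only by chance.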

An article entitled "Assuring Ownership Rights for Digital
Images" by G. Caronni, in Proc. Reliable IT Systems,
VIS '95, 1995 suggests adding tags (small geometric
patterns) to digitized images at brightness levels that are
imperceptible. While the idea of hiding a spatial watermark
in an image is fundamentally sound, this scheme is
susceptible to attack by filtering and redigitization. The fainter such
watermarks are, the more susceptible they are to such attacks
and geometric shapes provide only a limited alphabet with
which to encode information. Moreover, the scheme is not
applicable to audio data and may not be robust to common
geometric distortions, especially cropping.

J. Brassil et al in
an article entitled "Electronic Marking and Identification
Techniques to Discourage Document Copying" in Proc. of
Infocom 94, pp 1278-1287, 1994 propose three methods
appropriate for document images in which text is common.
Digital watermarks are coded by: (1) vertically shifting text
lines, (2) horizontally shifting words, or (3) altering text
features such as the vertical endlines of individual
characters. Unfortunately, all three proposals are easily defeated, as
discussed by the authors. Moreover, these techniques are
restricted exclusively to images containing text.

An article by K. Tanaka et al entitled "Embedding Secret
Information into a Dithered Multi-level Image" in IEEE
Military Comm. Conf., pp 216-220, 1990 and K. Mitsui et al
in an article entitled "Video-Steganography" in IMA
Intellectual Property Proc., v1, pp 187-206, 1994, describe
several watermarking schemes that rely on embedding
watermarks that resemble quantization noise. Their ideas hinge on
the notion that quantization noise is typically imperceptible
to viewers. Their first scheme injects a watermark into an
image by using a predetermined data stream to guide level
selection in a predictive quantizer. The data stream is chosen
so that the resulting watermark looks like quantization noise.
A variation of this scheme is also presented, where a
watermark in the form of a dithering matrix is used to dither
an image in a certain way. There are several drawbacks to
these schemes. The most important is that they are
susceptible to signal processing, especially requantization, and
geometric attacks such as cropping. Furthermore, they
degrade an image in the same way that predictive coding and
dithering can.

In Tanaka et al, the authors also propose a scheme for
watermarking facsimile data. This scheme shortens or
lengthens certain runs of data in the run length code used to
generate the coded fax image. This proposal is susceptible to
digital-to-analog and analog-to-digital conversions. In
particular, randomizing the least significant bit (LSB) of
each pixel's intensity will completely alter the resulting run
length encoding. Tanaka et al also propose a watermarking
method for "color-scaled picture and video sequences". This
method applies the same signal transform as JPEG (DCT of
8x8 sub-blocks of an image) and embeds a watermark in the
coefficient quantization module. While being compatible
with existing transform coders, this scheme is quite
susceptible to requantization and filtering and is equivalent to
coding the watermark in the least significant bits of the
transform coefficients.

In a recent paper, by Macq and Quisquater entitled 
"Cryptology for Digital TV Broadcasting" in Proc. of the 
IEEE, 83(6), pp944-957, 1995 there is briefly discussed the 
issue of watermarking digital images as part of a general 
survey on cryptography and digital television. The authors 
provide a description of a procedure to insert a watermark
into the least significant bits of pixels located in the vicinity
of image contours. Since it relies on modifications of the
least significant bits, the watermark is easily destroyed.
Further, the method is only applicable to images in that it
seeks to insert the watermark into image regions that lie on
the edge of contours.

W. Bender et al in an article entitled "Techniques for Data
Hiding" in Proc. of SPIE, v2420, page 40, July 1995,
describe two watermarking schemes. The first is a statistical
method called "Patchwork". Patchwork randomly chooses n
pairs of image points (a_i, b_i) and increases the brightness at
a_i by one unit while correspondingly decreasing the
brightness of b_i. The expected value of the sum of the differences
of the n pairs of points is claimed to be 2n, provided certain
statistical properties of the image are true. In particular, it is
assumed that all brightness levels are equally likely, that is,
intensities are uniformly distributed. However, in practice,
this is very uncommon. Moreover, the scheme may not be
robust to randomly jittering the intensity levels by a single
unit, and may be extremely sensitive to geometric affine
transformations.
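
The Patchwork statistic can be made concrete with a short
sketch. It assumes a flat list of integer brightness values,
disjoint pairs, and no clipping at the ends of the brightness
range; the function names are hypothetical and the pair
selection by seeded generator is an illustrative stand-in.

```python
import random


def choose_pairs(n_pixels, n, seed):
    """Pick n disjoint (a_i, b_i) index pairs pseudo-randomly."""
    idx = random.Random(seed).sample(range(n_pixels), 2 * n)
    return list(zip(idx[:n], idx[n:]))


def patchwork_embed(pixels, pairs):
    """Raise brightness at a_i by one unit and lower it at b_i."""
    out = list(pixels)
    for a, b in pairs:
        out[a] += 1
        out[b] -= 1
    return out


def patchwork_statistic(pixels, pairs):
    """Sum of brightness differences over the chosen pairs."""
    return sum(pixels[a] - pixels[b] for a, b in pairs)


rng = random.Random(0)
image = [rng.randint(1, 254) for _ in range(1000)]
pairs = choose_pairs(len(image), n=100, seed=123)
marked = patchwork_embed(image, pairs)

before = patchwork_statistic(image, pairs)
after = patchwork_statistic(marked, pairs)
print(before, after, after - before)
```

Embedding shifts the statistic by exactly 2n, but a detector
without the original sees only the shifted value, so detection
depends on the unmarked statistic being near zero, which is the
uniform-intensity assumption criticized above.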

The second method is called "texture block coding",
where a region of random texture pattern found in the image
is copied to an area of the image with similar texture.
Autocorrelation is then used to recover each texture region.
The most significant problem with this technique is that it is
only appropriate for images that possess large areas of
random texture. The technique could not be used on images
of text, for example. Nor is there a direct analog for audio.

In addition to direct work on watermarking images, there
are several works of interest in related areas. E. H. Adelson
in U.S. Pat. No. 4,939,515 entitled "Digital Signal Encoding
and Decoding Apparatus" describes a technique for
embedding digital information in an analog signal for the purpose
of inserting digital data into an analog TV signal. The analog
signal is quantized into one of two disjoint ranges ({0,2,4
. . . }, {1,3,5}, for example) which are selected based on the
binary digit to be transmitted. Thus Adelson's method is
equivalent to watermark schemes that encode information
into the least significant bits of the data or its transform
coefficients. Adelson recognizes that the method is
susceptible to noise and therefore proposes an alternative scheme
wherein a 2x1 Hadamard transform of the digitized analog
signal is taken. The differential coefficient of the Hadamard
transform is offset by 0 or 1 unit prior to computing the
inverse transform. This corresponds to encoding the
watermark into the least significant bit of the differential
coefficient of the Hadamard transform. It is not clear that this
approach would demonstrate enhanced resilience to noise.
Furthermore, like all such least significant bit schemes, an
attacker can eliminate the watermark by randomization.

U.S. Pat. No. 5,010,405 describes a method of interleaving
a standard NTSC signal within an enhanced definition
television (EDTV) signal. This is accomplished by analyzing
the frequency spectrum of the EDTV signal (larger than
that of the NTSC signal) and decomposing it into three
sub-bands (L, M, H for low, medium and high frequency
respectively). In contrast, the NTSC signal is decomposed
into two subbands, L and M. The coefficients, M_k, within the
M band are quantized into M levels and the high frequency
coefficients, H_k, of the EDTV signal are scaled such that the
addition of the H_k signal plus any noise present in the system
is less than the minimum separation between quantization
levels. Once more, the method relies on modifying least
significant bits. Presumably, the mid-range rather than low
frequencies were chosen because they are less perceptually
significant. In contrast, the method proposed in the present
invention modifies the most perceptually significant
components of the signal.

Finally, it should be noted that many, if not all, of the prior 
art protocols are not collusion resistant.

Recently, Digimarc Corporation of Portland, Oreg., has 
described work referred to as signature technology for use in 
identifying digital intellectual property. Their method adds 
or subtracts small random quantities from each pixel.
Addition or subtraction is based on comparing a binary mask 
of N bits with the least significant bit (LSB) of each pixel. 
If the LSB is equal to the corresponding mask bit, then the 
random quantity is added, otherwise it is subtracted. The 
watermark is extracted by first computing the difference 
between the original and watermarked images and then by 
examining the sign of the difference, pixel by pixel, to
determine if it corresponds to the original sequence of 
additions/subtractions. The Digimarc technique is not based 
on direct modifications of the image spectrum and does not 
make use of perceptual relevance. While the technique 
appears to be robust, it may be susceptible to constant 
brightness offsets and to attacks based on exploiting the high
degree of local correlation present in an image. For example, 
randomly switching the position of similar pixels within a 
local neighborhood may significantly degrade the water- 
mark without damaging the image. 
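
The add-or-subtract rule and the sign test described above can
be sketched as follows. This is an illustration only, not
Digimarc's implementation: it assumes a flat list of integer
pixels, random quantities between 1 and 3, a mask applied
cyclically, and no clipping. The function names are
hypothetical.

```python
import random


def signature_embed(pixels, mask_bits, seed):
    """Add a small random quantity where LSB == mask bit, else subtract."""
    rng = random.Random(seed)
    marked, signs = [], []
    for i, p in enumerate(pixels):
        quantity = rng.randint(1, 3)
        # The N-bit mask is reused cyclically (an assumption).
        sign = 1 if (p & 1) == mask_bits[i % len(mask_bits)] else -1
        marked.append(p + sign * quantity)
        signs.append(sign)
    return marked, signs


def sign_agreement(original, candidate, signs):
    """Fraction of pixels whose difference sign matches the record."""
    hits = 0
    for o, c, s in zip(original, candidate, signs):
        diff = c - o
        if (diff > 0 and s > 0) or (diff < 0 and s < 0):
            hits += 1
    return hits / len(signs)


rng = random.Random(1)
original = [rng.randint(10, 240) for _ in range(500)]
mask = [1, 0, 0, 1, 1, 0, 1, 0]
marked, signs = signature_embed(original, mask, seed=99)

clean_score = sign_agreement(original, marked, signs)
# Constant brightness offset larger than any embedded quantity.
offset_score = sign_agreement(original, [m + 5 for m in marked], signs)
print(clean_score, offset_score)
```

With the offset applied every difference becomes positive, so
the agreement falls to the fraction of pixels where a quantity
was added, which illustrates the sensitivity to constant
brightness offsets noted in the text.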

In a paper by Koch, Rindfrey and Zhao entitled
"Copyright Protection for Multimedia Data", two general methods
for watermarking images are described. The first method
partitions an image into 8x8 blocks of pixels and computes
the Discrete Cosine Transform (DCT) of each of these
blocks. A pseudorandom subset of the blocks is chosen and
in each such block a triple of frequencies selected from one
of 18 predetermined triples is modified so that their relative
strengths encode a 1 or 0 value. The 18 possible triples are
composed by selection of three out of eight predetermined
frequencies within the 8x8 DCT block. The choice of the
eight frequencies to be altered within the DCT block appears
to be based on the belief that middle frequencies have a
moderate variance level, i.e., they have similar magnitude.
This property is needed in order to allow the relative strength
of the frequency triples to be altered without requiring a
modification that would be perceptually noticeable. Unlike
in the present invention, the set of frequencies is not chosen
based on any perceptual significance or relative energy
considerations. In addition, because the variance between
the eight frequency coefficients is small, one would expect
that the technique may be sensitive to noise or distortions.
This is supported by the experimental results reported in the
Koch et al paper, supra, where it is reported that "the
embedded labels are robust against JPEG compression for
a quality factor as low as about 50%". In contrast, the
method described in accordance with the teachings of the
present invention has been demonstrated with compression
quality factors as low as 5 percent.

An earlier proposal by Koch and Zhao in a paper entitled
"Toward Robust and Hidden Image Copyright Labeling"
proposed not triples of frequencies but pairs of frequencies
and was again designed specifically for robustness to JPEG
compression. Nevertheless, the report states that "a lower
quality factor will increase the likelihood that the changes
necessary to superimpose the embedded code on the signal
will be noticeably visible".

In a second method, proposed by Koch and Zhao,
designed for black and white images, no frequency
transform is employed. Instead, the selected blocks are modified
so that the relative frequency of white and black pixels
encodes the final value. Both watermarking procedures are
particularly vulnerable to multiple document attacks. To
protect against this, Zhao and Koch proposed a distributed
8x8 block of pixels created by randomly sampling 64 pixels
from the image. However, the resulting DCT has no
relationship to that of the true image. Consequently, one would
expect such distributed blocks to be both sensitive to noise
and likely to cause noticeable artifacts in the image.

In summary, prior art digital watermarking techniques are
not robust and the watermark is easy to remove. In addition,
many prior techniques would not survive common signal
and geometric distortions.

SUMMARY OF THE INVENTION 

The present invention overcomes the limitations of the
prior art methods by providing a watermarking system that
embeds a unique identifier into the perceptually significant
components of a decomposition of an image, an audio signal
or a video sequence.

Preferably, the decomposition is a spectral frequency 
decomposition. The watermark is embedded in the data's 
perceptually significant frequency components. This is 
because an effective watermark cannot be located in 
perceptually insignificant regions of image data or in its 
frequency spectrum, since many common signal or geometric 
processes affect these components. For example, a watermark 
located in the high frequency spectral components of 
an image is easily removed, with minor degradation to the 
image, by a process that performs low pass filtering. The 
issue then becomes one of how to insert the watermark into 
the most significant regions of the data frequency spectrum 
without the alteration being noticeable to an observer, i.e., a 
human or a machine feature recognition system. Any spectral 
component may be altered, provided the alteration is 
small. However, very small alterations are susceptible to any 
noise present or intentional distortion. 
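
The low pass filtering example can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the procedure of the invention: it assumes a one-dimensional random-walk test signal, NumPy's real FFT as the spectral decomposition, an additive watermark on the real part of selected bins, and a brick-wall low pass filter as the attack. The names `embed`, `lowpass` and `response`, the bin ranges, the cutoff and the strength `ALPHA` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024        # signal length (assumed)
ALPHA = 0.5     # embedding strength (assumed)

def embed(signal, w, bins):
    """Add ALPHA * w to the real part of the chosen rFFT bins."""
    spec = np.fft.rfft(signal)
    spec[bins] += ALPHA * w
    return np.fft.irfft(spec, n=len(signal))

def lowpass(signal, cutoff):
    """Brick-wall low pass filter: zero every bin at or above `cutoff`."""
    spec = np.fft.rfft(signal)
    spec[cutoff:] = 0
    return np.fft.irfft(spec, n=len(signal))

def response(attacked, original, w, bins):
    """Normalized correlation between w and the change observed in `bins`."""
    diff = (np.fft.rfft(attacked) - np.fft.rfft(original)).real[bins]
    return float(diff @ w / np.sqrt(diff @ diff))

original = np.cumsum(rng.normal(size=N))   # energy concentrated at low frequencies
w = rng.normal(size=100)                   # watermark values
low_bins = np.arange(1, 101)               # low frequency placement
high_bins = np.arange(400, 500)            # high frequency placement

resp_low = response(lowpass(embed(original, w, low_bins), 200),
                    original, w, low_bins)
resp_high = response(lowpass(embed(original, w, high_bins), 200),
                     original, w, high_bins)
print(f"low frequency watermark after filtering:  {resp_low:.1f}")
print(f"high frequency watermark after filtering: {resp_high:.1f}")
```

In this sketch the low frequency watermark passes through the filter untouched, so its response stays large, while the high frequency watermark is zeroed along with its bins and its response falls to the level expected of an unrelated sequence.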

In order to overcome this problem, the frequency domain 
of the image data or sound data may be considered as a 
communication channel, and correspondingly the watermark 
may be considered as a signal transmitted through the 
channel. Attacks and unintentional signal distortions are thus 
treated as noise to which the transmitted signal must be 
immune. Attacks are intentional efforts to remove, delete or 
otherwise overcome the beneficial aspects of the data 
watermarking. While the present invention is intended to embed 
watermarks in data, the same methodology can be applied to 
sending any type of message through media data. 

Instead of encoding the watermark into the least significant 
components of the data, the present invention considers 
applying concepts of spread spectrum communication. In 
spread spectrum communications, a narrowband signal is 
transmitted over a much larger bandwidth such that the 
signal energy present in any single frequency is imperceptible. 
In a similar manner, the watermark is spread over 
many frequency bins so that the energy in any single bin is 
small and imperceptible. Since the watermark verification 
process includes a priori knowledge of the locations and 
content of the watermarks, it is possible to concentrate these 
many weak signals into a single signal with a high 
signal-to-noise ratio. Destruction of such a watermark would 
require noise of high amplitude to be added to every 
frequency bin. 
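
The concentration of many weak signals into one strong signal can be sketched numerically. The example below is an illustration under stated assumptions, not the verification process of the invention: the watermark values, the per-bin noise level `sigma`, the amplitude `alpha` and the normalized-correlation function `response` are hypothetical choices. The verifier is assumed to know the watermark values and their locations, so it can correlate across all bins at once.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000        # number of frequency bins carrying the watermark
alpha = 0.5     # per-bin watermark amplitude (assumed)
sigma = 1.0     # per-bin noise standard deviation (assumed)

w = rng.normal(size=n)        # the watermark, known a priori to the verifier
other = rng.normal(size=n)    # an unrelated watermark, for comparison
received = alpha * w + rng.normal(scale=sigma, size=n)  # watermark buried in noise

def response(bins, candidate):
    """Correlate the received bins with a candidate watermark."""
    return float(bins @ candidate / np.sqrt(bins @ bins))

per_bin_snr = alpha**2 / sigma**2   # power ratio in any single bin
print(f"per-bin SNR:               {per_bin_snr:.2f}")
print(f"response, true watermark:  {response(received, w):.1f}")
print(f"response, other watermark: {response(received, other):.1f}")
```

In any single bin the watermark is weaker than the noise, yet summing over all bins with the known watermark as the template yields a response far above what an unrelated sequence produces.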

In accordance with the teachings of the present invention, 
a watermark is inserted into the perceptually most significant 
regions of the data decomposition. The watermark itself is 
designed to appear to be additive random noise and is spread 
throughout the image. By placing the watermark into the 
perceptually significant components, it is much more difficult 
for an attacker to add more noise to the components 
without adversely affecting the image or other data. It is the 
fact that the watermark looks like noise and is spread 
throughout the image or data which makes the present 
scheme appear to be similar to spread spectrum methods 
used in communication systems. 

Spreading the watermark throughout the spectrum of an 
image ensures a large measure of security against unintentional 
or intentional attack. First, the location of the watermark 
is not obvious. Second, frequency regions are selected 
in a fashion that ensures severe degradation of the original 
data following any attack on the watermark. 

A watermark that is well placed in the frequency domain 
of an image or a sound track will be practically impossible 
to see or hear. This will always be the case if the energy in 
the watermark is sufficiently small in any single frequency 
coefficient. Moreover, it is possible to increase the energy 
present in particular frequencies by exploiting knowledge of 
masking phenomena in the human auditory and visual 
systems. Perceptual masking refers to any situation where 
information in certain regions of an image or a sound is 
occluded by perceptually more prominent information in 
another part of the image or sound. In digital waveform 
coding, this frequency domain (and in some cases, 
time/pixel domain) masking is exploited extensively to achieve 
low bit rate encoding of data. It is clear that both auditory 
and visual systems attach more resolution to the high energy, 
low frequency, spectral regions of an auditory or visual 
scene. Further, spectrum analysis of images and sounds 
reveals that most of the information in such data is often 
located in the low frequency regions. 

In addition, particularly for processed or compressed data, 
perceptually significant need not refer to human perceptual 
significance, but may refer instead to machine perceptual 
significance, for instance, machine feature recognition. 

To meet these requirements, a watermark is proposed whose 
structure comprises a large quantity, for instance 1000, of 
randomly generated numbers with a normal distribution having 
zero mean and unity variance. A binary watermark is not chosen 
because it is much less robust to attacks based on collusion of 
several independently watermarked copies of an image. However, 
generally, the watermark might have arbitrary structure, 
deterministic and/or random, and including uniform 
distributions. The length of the proposed watermark is variable 
and can be adjusted to suit the characteristics of the data. For 
example, longer watermarks might be used for images that are 
especially sensitive to large modifications of their spectral 
coefficients, thus requiring weaker scaling factors for 
individual components. 

The watermark is then placed in components of the image 
spectrum. These components may be chosen based on an analysis 
of those components which are most vulnerable to attack and/or 
which are most perceptually significant. This ensures that the 
watermark remains with the image even after common signal and 
geometric distortions. Modification of these spectral components 
results in severe image degradation long before the watermark 
itself is destroyed. Of course, to insert the watermark, it is 
necessary to alter these very same coefficients. However, each 
modification can be extremely small and, in a manner similar to 
spread spectrum communication, a strong narrowband watermark 
may be distributed over a much broader image (channel) spectrum. 
Conceptually, detection of the watermark then proceeds by adding 
all of these very small signals, whose locations are only known 
to the copyright owner, and concentrating the watermark into a 
signal with high signal-to-noise ratio. Because the location of 
the watermark is only known to the copyright holder, an attacker 
would have to add very much more noise energy to each spectral 
coefficient in order to be confident of removing the watermark. 
However, this process would destroy the image. 

Preferably, a predetermined number of the largest coefficients 
of the DCT (discrete cosine transform) (excluding the DC term) 
are used. However, the choice of the DCT is not critical to the 
algorithm and other spectral transforms, including wavelet type 
decompositions, are also possible. In fact, use of the FFT rather 
than the DCT is preferable from a computational perspective. 

The invention will be more clearly understood when the 
following description is read in conjunction with the 
accompanying drawing. 

BRIEF DESCRIPTION OF THE DRAWING 

FIG. 1 is a schematic representation of typical common 
processing operations to which data could be subjected; 

FIG. 2 is a schematic representation of a preferred system 
for immersing a watermark into an image; 

FIGS. 3a and 3b are flow charts of the encoding and 
decoding of watermarks; 

FIG. 4 is a graph of the responses of the watermark 
detector to random watermarks; 

FIG. 5 is a graph of the response of the watermark 
detector to random watermarks for an image which is 
successively watermarked five times; 

FIG. 6 is a graph of the response of the watermark 
detector to random watermarks where five images, each 
having a different watermark, are averaged together; and 

FIG. 7 is a schematic diagram of an optical embodiment 
of the present invention. 

DETAILED DESCRIPTION 

In order to better understand the advantages of the 
invention, the preferred embodiment of a frequency spectrum 
based watermarking system will be described. It is instructive 
to examine the processing stages that image (or sound) data may 
undergo in the copying process and to consider the effect that 
such processing stages can have on the data. Referring to 
FIG. 1, a watermarked image or sound data 10 is transmitted 12 
to undergo typical distortion or intentional tampering 14. Such 
distortions or tampering include lossy compression 16, 
geometric distortion 18, signal processing 20 and D/A and A/D 
conversion 22. After undergoing distortion or tampering, 
corrupted watermarked image or sound data 24 is transmitted 26. 
The process of transmission refers to the application of any 
source or channel code and/or of encryption techniques to the 
data. While most transmission steps are information lossless, 
many compression schemes (e.g., JPEG, MPEG, etc.) may 
potentially degrade the quality of the data through 
irretrievable data loss. In general, a watermarking method 
should be resilient to any distortions introduced by 
transmission or compression algorithms. 

Lossy compression 16 is an operation that usually eliminates 
perceptually irrelevant components of image or sound data. In 
order to preserve a watermark when undergoing lossy 
compression, the watermark is located in a perceptually 
significant region of the data. Most processing of this type 
occurs in the frequency domain. Data loss usually occurs in the 
high frequency components. Thus, the watermark must be placed 
in the significant frequency components of the image (or sound) 
data spectrum to minimize the adverse effects of lossy 
compression. 

After receipt, an image may encounter many common 
transformations that are broadly categorized as geometric 
distortions or signal distortions. Geometric distortions 18 are 
specific to image and video data, and include such operations 
as rotation, translation, scaling and cropping. By manually 
determining a minimum of four or nine corresponding points 
between the original and the distorted watermark, it is possible 
to remove any two or three dimensional affine transformation. 
However, an affine scaling (shrinking) of the image results in 
a loss of data in the high frequency spectral regions of the 
image. Cropping, or the cutting out and removal of portions of 
an image, also results in irretrievable loss of data. Cropping 
may be a serious threat to any spatially based watermark but is 
less likely to affect a frequency-based scheme. 

Common signal distortions include digital-to-analog and 
analog-to-digital conversion 22, resampling, requantization, 
including dithering and recompression, and common signal 
enhancements to image contrast and/or color, and audio 
frequency equalization. Many of these distortions are 
non-linear, and it is difficult to analyze their effect in 
either a spatial or frequency based method. However, the fact 
that the original image is known allows many signal 
transformations to be undone, at least approximately. For 
example, histogram equalization, a common non-linear contrast 
enhancement method, may be substantially removed by histogram 
specification or dynamic histogram warping techniques. 

Finally, the copied image may not remain in digital form. 
Instead, it is likely to be printed or an analog recording made 
(analog audio or video tape). These reproductions introduce 
additional degradation into the image data that a watermarking 
scheme must be robust to. 

Tampering (or attack) refers to any intentional attempt to 
remove the watermark, or corrupt it beyond recognition. The 
watermark must not only be resistant to the inadvertent 
application of distortions. It must also be immune to 
intentional manipulation by malicious parties. These 
manipulations can include combinations of distortions, and can 
also include collusion and forgery attacks. 

FIG. 2 shows a preferred system for inserting a watermark 
into an image in the frequency domain. Image data $X(i,j)$, 
assumed to be in digital form, or alternatively data in other 
formats such as photographs, paintings or the like, that have 
been previously digitized by well-known methods, is subject 
to a frequency transformation 30, such as the Fourier 
transform. A watermark signal $W(k)$ is inserted into the 
frequency spectrum components of the transformed image data 32 
applying the techniques described below. The frequency spectrum 
image data including the watermark signal is subjected to an 
inverse frequency transform 34, resulting in watermarked image 
data which may remain in digital form or be printed as an 
analog representation by well-known methods. 

After applying a frequency transformation to the image 
data 30, a perceptual mask is computed that highlights 
prominent regions in the frequency spectrum capable of 
supporting the watermark without overly affecting perceptual 
fidelity. This may be performed by using knowledge of the 
perceptual significance of each frequency in the spectrum, as 
discussed earlier, or simply by ranking the frequencies based 
on their energy. The latter method was used in experiments 
described below. 

In general, it is desired to place the watermark in regions 
of the spectrum that are least affected by common signal 
distortions and are most significant to image quality as 
perceived by a viewer, such that significant modification 
would destroy the image fidelity. In practice, these regions 
could be experimentally identified by applying common 
signal distortions to images and examining which frequencies 
are most affected, and by psychophysical studies to 
identify how much each component may be modified before 
significant changes in the image are perceivable. 

The watermark signal is then inserted into these prominent 
regions in a way that makes any tampering create 
visible (or audible) defects in the data. The requirements of 
the watermark mentioned above and the distortions common 
to copying provide constraints on the design of an electronic 
watermark. 

In order to better understand the watermarking method, 
reference is made to FIGS. 3(a) and 3(b) where from each 
document $D$ a sequence of values $X = x_1, \ldots, x_n$ is 
extracted 40 with which a watermark $W = w_1, \ldots, w_n$ is 
combined 42 to create an adjusted sequence of values 
$X' = x_1', \ldots, x_n'$ which is then inserted back 44 into the 
document in place of values $X$ in order to obtain a watermarked 
document $D'$. An attack of the document $D'$, or other 
distortion, will produce a document $D^*$. Having the original 
document $D$ and the document $D^*$, a possibly corrupted 
watermark $W^*$ is extracted 46 and compared to watermark $W$ 48 
for statistical analysis 50. The values $W^*$ are extracted by 
first extracting a set of values $X^* = x_1^*, \ldots, x_n^*$ from 
$D^*$ (using information about $D$) and then generating $W^*$ 
from the values $X^*$ and the values $X$. 

When combining the values $X$ with the watermark values $W$ 
in step 42, a scaling parameter $\alpha$ is specified. The 
scaling parameter $\alpha$ determines the extent to which 
values $W$ alter values $X$. Three preferred formulas for 
computing $X'$ are: 

$$x_i' = x_i + \alpha w_i \qquad (1)$$

$$x_i' = x_i (1 + \alpha w_i) \qquad (2)$$

$$x_i' = x_i e^{\alpha w_i} \qquad (3)$$

Equation 1 is invertible. Equations 2 and 3 are invertible 
when $x_i \neq 0$. Therefore, given $X^*$ it is possible to 
compute the inverse function necessary to derive $W^*$ from $X$ 
and $X^*$. 

Equation 1 is not the preferred formula when the values 
$x_i$ vary over a wide range. For example, if $x_i = 10^6$ then 
adding 100 may be insufficient to establish a watermark, but 
if $x_i = 10$, then adding 100 will unacceptably distort the 
value. Insertion methods using equations 2 and 3 are more robust 
when encountering such a wide range of values $x_i$. It will 
also be observed that equations 2 and 3 yield similar results 
when $\alpha w_i$ is small. Moreover, when $x_i$ is positive, 
equation 3 is equivalent to $\ln(x_i') = \ln(x_i) + \alpha w_i$, 
and may be considered an application of equation 1 when natural 
logarithms of the original values are used. For example, if 
$|w_i| \le 1$ and $\alpha = 0.01$, then using equation 2 
guarantees that the spectral coefficient will change by no more 
than 1 percent. 

For certain applications, a single scaling parameter $\alpha$ 
may not be best for combining all values of $x_i$. Therefore, 
multiple scaling parameters $\alpha_1, \ldots, \alpha_n$ can be 
used with equations 1 to 3, such as 
$x_i' = x_i (1 + \alpha_i w_i)$. The values of $\alpha_i$ serve 
as a relative measure of how much $x_i$ must be altered to 
change the perceptual quality of the document. A large value for 
$\alpha_i$ means that it is possible to alter $x_i$ by a large 
amount without perceptually degrading the document. 

A method for selecting the multiple scaling values is based 
upon certain general assumptions. For example, equation 2 is a 
special case of the generalized equation 1, 
$x_i' = x_i + \alpha_i w_i$, for $\alpha_i = \alpha x_i$. That 
is, equation 2 makes the reasonable assumption that a large 
value of $x_i$ is less sensitive to additive alteration than a 
small value of $x_i$. 

Generally, the sensitivity of the image to different values 
of $x_i$ is unknown. A method of empirically estimating the 
sensitivities is to determine the distortion caused by a 
number of attacks on the original image. For example, it is 
possible to compute a degraded image $D^*$ from $D$, extract 
the corresponding values $x_1^*, \ldots, x_n^*$ and select 
$\alpha_i$ to be proportional to the deviation $|x_i^* - x_i|$. 
For greater robustness, it is possible to try other forms of 
distortion and make $\alpha_i$ proportional to the average value 
of $|x_i^* - x_i|$. Instead of using the average distortion, it 
is possible to use the median or maximum deviation. 

Alternatively, it is possible to combine the empirical 
approach with general global assumptions regarding the 
sensitivity of the values. For example, it might be required 
that $\alpha_i \ge \alpha_j$ whenever $x_i \ge x_j$. This can be 
combined with the empirical approach by setting $\alpha_i$ 
according to 

$$\alpha_i \sim \max_{\{j \,:\, x_j \le x_i\}} |x_j^* - x_j|$$

A more sophisticated approach is to weaken the monotonicity 
constraint to be robust against occasional outliers. 

The length of the watermark, $n$, determines the degree to 
which the watermark is spread among the relevant components 
of the image data. As the size of the watermark increases, so 
does the number of altered spectral components, and the extent 
to which each component need be altered decreases for the same 
resilience to noise. Consider watermarks of the form 
$x_i' = x_i + \alpha w_i$ and a white noise attack 
$x_i^* = x_i' + r_i$, where the $r_i$ are chosen according to 
independent normal distributions with standard deviation 
$\sigma$. It is possible to recover the watermark when $\alpha$ 
is proportional to $\sigma / \sqrt{n}$. That is, quadrupling the 
number of components can halve the magnitude of the watermark 
placed into each component. The sum of the squares of the 
deviations remains essentially unchanged. 
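
The three insertion formulas and their inverses can be written out directly. The sketch below is illustrative only: the function names `insert` and `extract`, the `rule` argument, the random test values and the tolerances are hypothetical, and the inverse expressions are simply equations 1 to 3 solved algebraically for the watermark value.

```python
import numpy as np

def insert(x, w, alpha, rule):
    """Combine values x with watermark w using equation 1, 2 or 3."""
    if rule == 1:
        return x + alpha * w
    if rule == 2:
        return x * (1 + alpha * w)
    if rule == 3:
        return x * np.exp(alpha * w)
    raise ValueError("rule must be 1, 2 or 3")

def extract(x_star, x, alpha, rule):
    """Invert the chosen equation to recover W* from X* and the original X."""
    if rule == 1:
        return (x_star - x) / alpha
    if rule == 2:
        return (x_star / x - 1) / alpha      # requires x != 0
    if rule == 3:
        return np.log(x_star / x) / alpha    # requires x != 0 and x_star / x > 0
    raise ValueError("rule must be 1, 2 or 3")

rng = np.random.default_rng(2)
x = rng.uniform(10, 1e6, size=1000)   # values spanning a wide range (assumed)
w = rng.normal(size=1000)             # watermark values
alpha = 0.01                          # scaling parameter (assumed)

for rule in (1, 2, 3):
    recovered = extract(insert(x, w, alpha, rule), x, alpha, rule)
    print(f"equation {rule}: round trip ok = {np.allclose(recovered, w, atol=1e-6)}")

# Equations 2 and 3 nearly coincide when alpha * w is small.
close = np.allclose(insert(x, w, alpha, 2), insert(x, w, alpha, 3), rtol=5e-3)
print(f"equations 2 and 3 agree for small alpha * w: {close}")
```

With no attack between insertion and extraction, each inverse returns the original watermark values up to floating-point error. After an attack, the same `extract` call applied to the distorted values yields the possibly corrupted watermark.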

In general, a watermark comprises an arbitrary sequence 
of real numbers $W = w_1, \ldots, w_n$. In practice, each value 
$w_i$ may be chosen independently from a normal distribution 
$N(0,1)$, where $N(\mu, \sigma^2)$ denotes a normal distribution 
with mean $\mu$ and variance $\sigma^2$, or from a uniform 
distribution over {1,-1} or {0,1}. 

It is highly unlikely that the extracted mark $W^*$ will be 
identical to the original watermark $W$. Even the act of 
requantizing the watermarked document for transmission 
will cause $W^*$ to deviate from $W$. A preferred measure of the 
similarity of $W$ and $W^*$ is 

$$\mathrm{sim}(W, W^*) = \frac{W^* \cdot W}{\sqrt{W^* \cdot W^*}} \qquad (4)$$

Large values of $\mathrm{sim}(W, W^*)$ are significant in view 
of the following analysis. Assume that the authors of document 
$D^*$ had no access to $W$ (either through the seller or through 
a watermarked document). Then for whatever value of $W^*$ is 
obtained, the conditional distribution on $w_i$ will be 
independently distributed according to $N(0,1)$. In this case, 

$$W^* \cdot W = \sum_{i=1}^{n} w_i^* w_i \sim N(0, W^* \cdot W^*)$$

Thus, $\mathrm{sim}(W, W^*)$ is distributed according to 
$N(0,1)$. Then, one may apply the standard significance tests 
for the normal distribution. For example, if $W^*$ is chosen 
independently from $W$, then it is very unlikely that 
$\mathrm{sim}(W, W^*) > 5$. Note that somewhat higher values of 
$\mathrm{sim}(W, W^*)$ may be needed when a large number of 
watermarks are on file. The above analysis required only the 
independence of $W$ from $W^*$, and did not rely on any specific 
properties of $W^*$ itself. This fact provides further 
flexibility when preprocessing $W^*$. 

The extracted watermark $W^*$ may be processed in several 
ways to potentially enhance the ability to detect a watermark. 
For example, experiments on images encountered instances where 
the average value of $W^*$, denoted $E_i(W^*)$, differed 
substantially from 0, due to the effects of a dithering 
procedure. While this artifact could be easily eliminated as 
part of the extraction process, it provides a motivation for 
postprocessing extracted watermarks. As a result, it was 
discovered that the simple transformation 
$w_i^* \leftarrow w_i^* - E_i(W^*)$ yielded superior values of 
$\mathrm{sim}(W, W^*)$. The improved performance resulted from 
the decreased value of $W^* \cdot W^*$; the value of 
$W^* \cdot W$ was only slightly affected. 

In experiments it was frequently observed that $w_i^*$ could 
be greatly distorted for some values of $i$. One postprocessing 
option is to simply ignore such values, setting them to 0. 
That is, 

$$w_i^* \leftarrow \begin{cases} 0 & \text{if } |w_i^*| > \text{tolerance} \\ w_i^* & \text{otherwise} \end{cases}$$

The goal of such a transformation is to lower $W^* \cdot W^*$. 
A less abrupt version of this approach is to normalize the 
$W^*$ values to be either -1, 0 or 1, by 

$$w_i^* \leftarrow \mathrm{sign}(w_i^* - E_i(W^*))$$

This transformation can have a dramatic effect on the 
statistical significance of the result. Other robust statistical 
techniques could also be used to suppress outlier effects. 

In principle, any frequency domain transform can be used. 
In the scheme described below, a Fourier domain method is 
used, but wavelet based schemes are also usable as a variation. 
In terms of selecting frequency regions of the transform, it is 
possible to use models for the perceptual system under 
consideration. 

Frequency analysis may be performed by a wavelet or 
sub-band transform where the signal is divided into sub-bands 
by means of a wavelet or multi-resolution transform. The 
sub-bands need not be uniformly spaced. Each sub-band may be 
thought of as representing a frequency region in the domain 
corresponding to a sub-region of the frequency range of the 
signal. The watermark is then inserted into the sub-regions. 

For audio data, a sliding "window" moves along the 
signal data and the frequency transform (DCT, FFT, etc.) is 
taken of the sample in the window. This process enables the 
capture of meaningful information of a signal that is time 
varying in nature. 

Each coefficient in the frequency domain is assumed to 
have a perceptual capacity. That is, it can support the 
insertion of additional information without any (or with 
minimal) impact to the perceptual fidelity of the data. 

In order to place a length $L$ watermark into an $N \times N$ 
image, the $N \times N$ FFT (or DCT) of the image is computed 
and the watermark is placed into the $L$ highest magnitude 
coefficients of the transform matrix, excluding the DC 
component. More generally, $L$ randomly chosen coefficients 
could be chosen from the $M$, $M \ge L$, most perceptually 
significant coefficients of the transform. For most images, 
these coefficients will be the ones corresponding to the low 
frequencies. The watermark is placed in these locations because 
significant tampering with these frequencies will destroy the 
image fidelity or perceived quality well before the watermark is 
destroyed. 

The FFT provides perceptually similar results to the DCT. 
This is different from the case of transform coding, where the 
DCT is preferred to the FFT due to its spectral properties. 
The DCT tends to have less high frequency information than 
the FFT, and places most of the image information in the 
low frequency regions, making it preferable in situations 
where data need to be eliminated. In the case of 
watermarking, image data is preserved, and nothing is 
eliminated. Thus the FFT is as good as the DCT, and is 
preferred since it is easier to compute. 

In an experiment, a visually imperceptible watermark was 
intentionally placed in an image. Subsequently, 100 randomly 
generated watermarks, only one of which corresponded to the 
correct watermark, were applied to the watermark detector 
described above. The result, as shown in FIG. 4, was a very 
strong positive response corresponding to the correct 
watermark, suggesting that the method results in a very low 
number of false positive responses and a very low false 
negative response rate. 

In another test, the watermarked image was scaled to half 
of its original size. In order to recover the watermark, the 
image was re-scaled to its original size, albeit with loss of 
detail due to subsampling of the image using low pass spatial 
filter operations. The response of the watermark detector was 
well above random chance levels, suggesting that the 
watermark is robust to geometric distortions. This result was 
achieved even though 75 percent of the original data was 
missing from the scaled down image. 

In a further experiment, a JPEG encoded version of the 
image with parameters of 10 percent quality and 0 percent 
smoothing, resulting in visible distortions, was used. The 
results of the watermark detector suggest that the method is 
robust to common encoding distortions. Even using a version 
of the image with parameters of 5 percent quality 
and 0 percent smoothing, the results were well above that 
achievable due to random chance. 

In experiments using a dithered version of the image, the 
response of the watermark detector suggested that the 
method is robust to common encoding distortion. Moreover, 
more reliable detection is achieved by removing any non-zero 
mean from the extracted watermark. 

In another experiment, the image was clipped, leaving 
only the central quarter of the image. In order to extract the 
watermark from the clipped image, the missing portion of 
the image was replaced with portions from the original 
unwatermarked image. The watermark detector was able to 
recover the watermark with a response greater than random. 
When the non-zero mean was removed, and the elements of 
the watermark were binarized prior to the comparison with 
the correct watermark, the detector response was improved. 
This result is achieved even though 75 percent of the data 
was removed from the image. 

In yet another experiment, the image was printed, 
photocopied, scanned using a 300 dpi Umax PS-2400x 
scanner and rescaled to a size of 256x256 pixels. Clearly, the 
final image suffered from different levels of distortion 
introduced at each process. High frequency pattern noise was 
particularly noticeable. When the non-zero mean was 
removed and only the sign of the elements of the watermark 
was used, the watermark detector response improved to well 
above random chance levels. 

In still another experiment, the image was subject to five 
successive watermarking operations. That is, the original 
image was watermarked, the watermarked image was 
watermarked, and so forth. The process may be considered 
another form of attack in which it is clear that significant 
image degradation occurs if the process is repeated. FIG. 5 
shows the response of the watermark detector to 1000 
randomly generated watermarks including the five watermarks 
present in the image. The five dominant spikes in the 
graph, indicative of the presence of the five watermarks, 
show that successive watermarking does not interfere with 
the process. 

The fact that successive watermarking is possible means 
that the history or pedigree of a document is determinable if 
successive watermarking is added with each copy. 

In a variation of the multiple watermark image, five 
separately watermarked images were averaged together to 
simulate a simple collusion attack. FIG. 6 shows the 
response of the watermark detector to 1000 randomly generated 
watermarks, including the five watermarks present in 
the original images. The result is that simple collusion based 
on averaging is ineffective in defeating the present 
watermarking system. 

The result of the above experiments is that the described 
system can extract a reliable copy of the watermark from 
images that have been significantly degraded through several 
common geometric and signal processing procedures. 
These procedures include zooming (low pass filtering), 
cropping, lossy JPEG encoding, dithering, printing, 
photocopying and subsequent rescanning. 

While these experiments were, in fact, conducted using an 
image, similar results are attainable with text images, audio 
data and video data, although attention must be paid to the 
time varying nature of these data. 

The above implementation of the watermarking system is 
an electronic system. Since the basic principle of the 
invention is the inclusion of a watermark into spectral frequency 
components of the data, watermarking can be accomplished 
by other means using, for example, an optical system as 
shown in FIG. 7. 

In FIG. 7, data to be watermarked such as an image 52 is 
passed through a spatial transform lens 54, such as a Fourier 
transform lens, the output of which lens is the spatial 
transform of the image. Concurrently, a watermark image 56 
is passed through a second spatial transform lens 58, the 
output of which lens is the spatial transform of the watermark 
image 56. The spatial transform from lens 54 and the spatial 
transform from lens 58 are combined at an optical combiner 
60. The output of the optical combiner 60 is passed through 
an inverse spatial transform lens 62 from which the 
watermarked image 64 is present. The result is a unique, virtually 
imperceptible, watermarked image. Similar results are 
achievable by transmitting video or multimedia signals 
through the lenses in the manner described above. 

While there have been described and illustrated spread 
spectrum watermarking of data and variations and 
modifications thereof, it will be apparent to those skilled in the art 
that further variations and modifications are possible without 
deviating from the broad principles and spirit of the present 
invention which shall be limited solely by the scope of the 
claims appended hereto. 

What is claimed is: 

1. A method of inserting a watermark into data comprising 
the steps of: 

obtaining a spectral decomposition of data to be watermarked 
which data is a representation of humanly 
perceivable material; 

inserting a watermark into the perceptually significant 
components of the decomposition of data; and 

applying an inverse transform to the decomposition of 
data with the watermark for generating watermarked 
data. 

2. A method of inserting a watermark into data as set forth 
in claim 1, where said data comprises image data. 

3. A method of inserting a watermark into data as set forth 
in claim 1, where said data comprises video data. 

4. A method of inserting a watermark into data as set forth 
in claim 1, where said data comprises audio data. 

5. A method of inserting a watermark into data as set forth 
in claim 1, where said data comprises multimedia data. 

6. A method of inserting a watermark into data as set forth 
in claim 1, where said obtaining a spectral decomposition of 
data is selected from the group consisting of Fourier 
transformation, discrete cosine transformation, Hadamard 
transformation, and wavelet, multi-resolution, sub-band 
method. 

7. A method of inserting a watermark into data as set forth 
in claim 1, where said inserting a watermark inserts watermark 
values where adding additional signal into a 
perceptually significant component affects the perceived 
quality of the data. 

8. A method of inserting a watermark into data as set forth 
in claim 1, further comprising: 

comparing data with watermarked data for obtaining 
extracted data values; 

comparing extracted data values with watermark values 
and data for obtaining difference values; and 

analyzing difference values to determine the watermark in 
the watermarked data. 

9. The method of inserting a watermark into data as set 
forth in claim 8, where watermark values include associated 
scaling parameters. 

10. A method of inserting a watermark into data as set 
forth in claim 9, where scaling parameters are selected such 
that adding additional watermark value affects the perceived 
quality of the data. 
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U. A method of Losertiag a watermark into data as sei 24. A system for inserting a watermark into data as set 

forth in claim 8, where the watermark values arc cfaoseo forth in claim 23, wbere said first transform lens and said 

according to a random distribution. second transform lens are Fourier transform lenses and said 

12. A method of inserting a watermark into data oompris- inverse transform lens is an inverse Fourier transform lens, 
ing the steps of: . 5 25. A method of inserting a watermark into data corapris- 

extracting values of perceptually significant components i^g the steps of: 

of a spectral d^mposition of daa which data is a providing a medium containing data; 

rcprcsentauon of human perceivable material; ^ . . , j . . ^ 

combining watermark values with the extracted values to "^^^'Tf, ' "P^"*'"' dccomposiUon of data to be water- 

create adjusted values; and • niancea, 

inserting the adjusted values into the data in place of the inserting a watermaric into the percepmally significant 

extracted values to produce watermarked data. components of the decomposition of data; and 

13. The method of inserting a watermark into data as set applying an inverse transform to the decomposition of 
forth in claim 12, where watermark values include associ- data with the watermark to generate watermarked data, 
atcd scaling parameters. 26. A method of inserting a watermark into data as set 

14. A method of inserting a watennark into data as set forth in claim 25, where said data comprises image data, 
forth in claim 13, where scaling parameters are selected such 27. A method of inserting a watermark into data as set 
that adding additional watermark value affects the perceived fo^j^, ^ ^^^^ 25. where said data comprises vkieo data, 
quality of the data. ... 28. A method of inserting a watermark into data as set 

15. A method of inserUng a wateraaark into data as set ^^^^ ^ 25. where said data comprises audio data, 
forth m claim 12 where the watermark values are chosen ^9. A method of inserting a watermark into data as set 
accordmc to a random distribution, r .t_ • i • j_ -j j * u- j- 

<r A J c • . t, ■ #^ ,^ forth m claun 25, where said data comprises multimedia 

16. A nKtbod of inserting a watermark mto data as set * ^ 



data. 



forth in claim 12, hirtber comprising: ^ .t. r • * i • * ^ * 

J . * \Za «u»«:«: 30. A method of inserting a watermark mto data as set 

comparmg data with watermarked data for obtaining , . i -ic u u* ■ • . i ^ 

ejctracted data values- claun 25, where said obtaimng a spectral decom- 

, , ' , . , , , position of data is selected firom the group consisting of 

comparing extracted data v^ucs with watermark values p^^^^^ transformation, discrete cosine transformation, Had- 

and data for obtammg difference values; and ^^^^ transformation, and wavelet, muUi-resolution, sub- 

analyzing difference values to determine the watermark in j^^^ method 

the watcnnariced data. , . , 30 31. A method of inserting a watermark into data as set 

17. The method of inserting a watermark mto data as set ^ inserting a watermark inserts 
forth in claim 16, where watermark values inchide associ- watermark values where addition of additional signal into a 
ated scaling parameters. percepmally significant component affects the perceived 

18. A method of inserting a watermark into data as set quality of the data. 

forth in claim 12, where scahng parameters are selected such 3^ ^ ^^^^^ ^f inserting a watermark into daU as set 

that adding additional watermaric value affects the perceived ^ 35, further comprising: 

'*''l9\°L?tbod o^ inserting a watermark into data as set comparing data v^h watermarked data for obtaining 

forth in claim 16, where the watermaric values are chosen extracted data values, 

according to a random distribution. comparing extracted data values with watermark values 

20. A method of inserting a watermark in'to'data as set ^ and data for obtaining difference values; and 

forth in claim 16, further comprising the step of preprooess- analyzing difference values to determine the watermark in 

ing distorted or tampered watermarked data before said the watermarked data. 

comparing data. 33. The method of inserting a watermark into data as set 

21. A method of inserting a watermark into data as sel forth in claim 32, where waieraaark values include associ- 
forth in claim 20, where said distorted or tampered water- ated scaling parameters. 

marked data is clipped data and said preprocessing com- 34. A method of inserting a watermark into data as set 

prises replacing missing portions of the data with corre- forth in claim 33, where scahng parameters are selected such 

spending portions from original unwatermarked data. that adding additional watermark value affects the perceived 

22. A method of inserting a watermark into data as sel quality of the data. 

forth in claim 12, where said combining watermark values 35. A method of inserting a watermark into data as set 

sequentially combines watermark values for a plurality of forth in claim 32. where the watermark values are chosen 

watermarks. according to a random distribution. 

23. A system for inserting a watermark into data com- 36. A method of inserting a watermark into data compris- 
prising: ing the steps of: 

providing image data; providing a medium containing data; 

providing watermark data; extracting values of perceptually significant components 

first transform lens for transforming image data passing of a spectral decomposition of the data; 

therethrough into transformed image data; combining watermark values with the extracted values to 

second transform lens for transforming watermark data gQ create adjusted values; and 

passing therethrough into transformed watermark data; inserting the adjusted values into the data in place of the 

optical combiner for combining the transformed image extracted values to produce watermarked data. 

data and the transformed watermark data to form 37. The method of inserting a watermark into data as set 

transformed watermarked data; and forth in claim 36, where watermark values include associ- 

inverse transform lens for forming watermariced data by 65 ated scahng parameters. 

inverse transformation of transformed watermarked 38. A method of inserting a watermark into data as set 

data. forth in claim 37, where scahng paramctets are selected such 
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that adding addiliooal watermark value afifects the perceived 
quality of the data. 

39. A method of inserting a watermark into data as set
forth in claim 36, where the watermark values are chosen
according to a random distribution.

40. A method of inserting a watermark into data as set 
forth in claim 36, further comprising: 

comparing data with watermarked data for obtaining 

extracted data values; 
comparing extracted data values with watermark values 

and data for obtaining difference values; and 
analyzing difference values to determine the watermark in

the watermarked data.

41. The method of inserting a watermark into data as set 
forth in claim 40, where watermark values include associ- 
ated scaling parameters. 

42. A method of inserting a watermark into data as set
forth in claim 41, where scaling parameters are selected such
that adding additional watermark value affects the perceived
quality of the data.

43. A method of inserting a watermark into data as set
forth in claim 40, where the watermark values are chosen
according to a random distribution.

44. A method of inserting a watermark into data as set
forth in claim 40, further comprising the step of preprocessing
distorted or tampered watermarked data before said
comparing data.

45. A method of inserting a watermark into data as set
forth in claim 44, where said distorted or tampered watermarked
data is clipped data and said preprocessing comprises
replacing missing portions of the data with corresponding
portions from original unwatermarked data.

46. A method of inserting a watermark into data as set 
forth in claim 36, where said combining watermark values 
sequentially combines watermark values for a plurality of 
watermarks. 

* * * * *

DIGITAL WATERMARKING

FIELD OF INVENTION

The present invention relates to digital watermarking of
data including image, video and multimedia data.
Specifically, the invention relates to the insertion and extrac- 
tion of embedded signals for purposes of watermarking, in 
which the insertion and extraction procedures are repeatedly 
applied to subregions of the data. When these subregions 
correspond to the 8x8 pixel blocks used for MPEG and 
JPEG compression and decompression, the watermarking 
procedure can be tightly coupled with these compression 
algorithms to achieve very significant savings in computa- 
tion. 

BACKGROUND OF THE INVENTION 

The proliferation of digitized media such as image, video 
and multimedia is creating a need for a security system 
which facilitates the identification of the source of the 
material. 

Content providers, i.e. owners of works in digital data 
form, have a need to embed signals into video/image/ 
multimedia data which can subsequently be detected by
software and/or hardware devices for purposes of authenti- 
cating copyright ownership, control and management. 

For example, a coded signal might be inserted in data to 
indicate that the data should not be copied. The embedded 
signal should preserve the image fidelity, be robust to 
common signal transformations and resistant to tampering. 
In addition, consideration must be given to the data rate that 
can be provided by the system, though current requirements 
are relatively low, a few bits per frame.

In U.S. patent application Ser. No. 08/534,894, filed Sep. 
28, 1995, entitled "Secure Spread Spectrum Watermarking
for Multimedia Data" now abandoned and assigned to the
same assignee as the present invention, which is incorporated
herein by reference, there was proposed a spread
spectrum watermarking method which embedded a watermark
signal into perceptually significant regions of an image
for the purposes of identifying the content owner and/or 
possessor. A strength of this approach is that the watermark 
is very difficult to remove. In fact, this method only allows 
the watermark to be read if the original image or data is 
available for comparison. There are two reasons for this. First,
the original spectrum of the watermark is shaped to that of the
image through a non-linear multiplicative procedure, and this
spectral shaping must be removed prior to detection by matched
filtering. Second, the watermark is inserted into the N largest
spectral coefficients, the ranking of which is not preserved after
watermarking. Thus, this method does not allow software
and hardware devices to directly read embedded signals. 

In an article by Cox et al., entitled "Secured Spectrum
Watermarking for Multimedia" available at
http://www.neci.nj.com/tr/index.html (Technical Report No.
95-10) spread spectrum watermarking is described which
embeds a pseudo-random noise sequence into the digital
data for watermarking purposes.

The above prior art watermark extraction methodology 
requires the original image spectrum be subtracted from the 
watermark image spectrum. This restricts the use of the 
method when there is no original image or original image 
spectrum available. One application where this presents a 
significant difficulty is for third party device providers
desiring to read embedded information for operation or
denying operation of such a device.

In U.S. Pat. No. 5,319,735 by R. D. Preuss et al entitled
"Embedded Signalling" digital information is encoded to
produce a sequence of code symbols. The sequence of code
symbols is embedded in an audio signal by generating a
corresponding sequence of spread spectrum code signals
representing the sequence of code symbols. The frequency
components of the code signal are essentially confined to
a preselected signalling band lying within the bandwidth of
the audio signal, and successive segments of the code signal
correspond to successive code symbols in the sequence.
The audio signal is continuously frequency analyzed over a
frequency band encompassing the signalling band and the
code signal is dynamically filtered as a function of the
analysis to provide a modified code signal with frequency
component levels which are, at each time instant, essentially
a preselected proportion of the levels of the audio signal
frequency components in corresponding frequency ranges.
The modified code signal and the audio signal are combined
to provide a composite audio signal in which the digital
information is embedded. This composite audio signal is
then recorded on a recording medium or is otherwise subjected
to a transmission channel. Two key elements of this
process are the spectral shaping and spectral equalization
that occur at the insertion and extraction stages, respectively,
thereby allowing the embedded signal to be extracted without
access to the unwatermarked original data.

In U.S. patent application Ser. No. 08/708,331, filed Sep.
4, 1996, entitled "A Spread Spectrum Watermark for Embedded
Signaling" by Cox, now U.S. Pat. No. 5,848,155 and
incorporated herein by reference, there is described a
method for extracting a watermark of embedded data from
watermarked images or video without using an original or
unwatermarked version of the data. This work can be viewed
as an extension of the original work of Preuss et al from the
audio domain to images and video.

This method of watermarking an image or image data for
embedding signaling requires that the DCT (discrete cosine
transform) and its inverse of the entire image be computed.
There are fast algorithms for computing the DCT in N log N
time, where N is the number of pixels in the image.
However, for N=512x512, the computational requirement is
still high, particularly if the encoding and extracting processes
must occur at video rates, i.e. 30 frames per second.
This method requires approximately 30 times the computation
needed for MPEG-II decompression.

One possible way to achieve real-time video watermarking
is to only watermark every Nth frame. However, content
owners wish to protect each and every video frame. 
Moreover, if it is known which frames contain embedded 
signals, it is simple to remove those frames with no notice- 
able degradation in the video signal. 

In U.S. patent application Ser. No. 08/715,953, filed Sep.
19, 1996, entitled "Watermarking of Image Data Using
MPEG/JPEG Coefficients" by Cox, and incorporated herein
by reference, there is described an alternative method, which
is to insert the watermark into nxn blocks of the image
(subimages) where n << N. Then the computation cost is

$$\frac{N}{n}\, n \log n = N \log n.$$

For $N = 512 \times 512 = 2^{18}$ and $n = 8 \times 8 = 2^{6}$, the
asymptotic saving is only a factor of 3. However, empirically the
cost of computing the DCT over the entire image may be significantly
higher when cache, loop unfolding and other efficiency issues are
considered. Thus, the practical difference may approach a 30 fold
savings. More importantly, if the block size is chosen to be 8x8,
i.e. the same size as that used for MPEG image compression, then it
is possible to tightly couple the watermark insertion and extraction
procedures to those of the MPEG compression and decompression
algorithms. Considerable computational saving can then be achieved
since the most expensive computations relate to the calculation of
the DCT and its inverse and these steps are already computed as part
of the compression and decompression algorithm. The incremental cost
of watermarking is then very small, typically less than 5% of the
computational requirements associated with MPEG.

The present invention improves the reliability of the invention
described in the 08/715,953 application, now pending, by storing
watermark information into subimages, and extracting watermark
information from subimages, in a manner different from that
described earlier.

SUMMARY OF THE INVENTION

The present invention improves the reliability of the prior systems
by systematically varying the order in which watermark signal
components are inserted into each subimage, by inserting only part
of the watermark signal into each subimage, and, during watermark
detection, by combining the watermark signals found in groups of
subimages to reconstruct the original watermark signal before
testing for correlation with any predefined watermarks.

For detection, a reverse transformation is applied to each subimage
to reconstruct the watermark information that was stored in that
subimage. The resulting signals are then averaged together to
reconstruct the whole watermark, and to reduce noise. Finally, this
reconstructed watermark is compared against a predefined set of
watermark signals to determine which one was inserted into the image.

A principal object of the present invention is therefore, the
provision of inserting a subset of a watermark into a subset of
subregions of data to be watermarked.

Another object of the invention is the provision of a digital
watermarking system in which a watermark is extracted by averaging
the watermarked signal from subregions of watermarked data, and then
correlating the resulting signal to determine the watermark.

A further object of the invention is the provision of a digital
watermarking system in which the watermark is composed of two
portions, a verification portion and a synchronization portion, in
order to improve watermark extraction reliability.

Further and still other objects of the invention will become more
clearly apparent when the following description is read in
conjunction with the accompanying drawing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram of a watermark insertion
procedure;

FIG. 2 is a schematic block diagram of a watermark insertion
procedure in accordance with the teachings of the present invention;

FIG. 3 is a schematic block diagram of a watermark extraction
procedure;

FIG. 4 is a schematic block diagram of a watermark extraction
procedure in accordance with the teachings of the present invention;

FIG. 5 is a graphic representation of a zigzag pattern useful for
vectorizing subimages;

FIG. 6 is a graphic representation of rotation of PN sequences;

FIG. 7 is a graphical representation of an 8x8 block showing the
spatial relation of averaged terms;

FIG. 8 is a schematic block diagram of a method for inserting
watermarks in accordance with the present invention; and

FIG. 9 is a schematic block diagram of a method for extracting
watermarks in accordance with the present invention.

DETAILED DESCRIPTION

Referring now to the figures, and FIGS. 1 through 4 in particular,
there is shown schematic block diagrams of a [illegible] method for
inserting and detecting watermarks in [illegible] instance images.

In the following description, reference may be made to [illegible]
the invention has applicability [illegible] understood that the
teachings herein and the invention itself are equally applicable to
video, image and multimedia data and the term "image" and "image
data" will be understood to include these terms where applicable. As
used herein, "watermark" will be understood to include embedded
data, symbols, images, instructions or any other identifying
information.

In the following description, reference is made to procedures
described in U.S. patent application Ser. No. 08/534,894 for
inserting and extracting or detecting a watermark in images as
INSERT-ORIGINAL and EXTRACT-ORIGINAL, respectively. Reference is made
to procedures described in U.S. patent application Ser. No.
08/708,331 filed Sep. 4, 1996, now U.S. Pat. No. 5,848,155 for
inserting and extracting or detecting watermarks in images as
INSERT-WHOLE and EXTRACT-WHOLE, respectively. And reference is made
to procedures described in U.S. patent application Ser. No.
08/715,953 for inserting and extracting or detecting watermarks in
images as INSERT-MPEG-A and EXTRACT-MPEG-A, respectively.

FIG. 1 shows a schematic block diagram of INSERT-WHOLE procedure for
inserting watermarks into images. The watermark signal, in the form
of a finite sequence of symbols chosen from an alphabet, is provided
as an input to an error correction encoder 10 which transforms this
sequence into another sequence that contains redundant information.
The output of encoder 10 is provided to a PN-mapper 11, which maps
each symbol of the encoded watermark into a pre-specified
pseudo-random noise (PN) code. The output of PN-mapper 11 is
provided to a spectral transformer 12, which converts the
pseudo-random noise sequence into the frequency domain. The
conversion preferably is by discrete cosine transform (DCT),
however, fast Fourier transform, wavelet type decomposition and the
like may also be used for frequency conversion.

Concurrently, the data to be watermarked is provided to another
spectral transformer 13. The outputs of the two spectral
transformers 12 and 13 are then provided as inputs to a spectral
shaper 14, which modifies the spectral properties of the
pseudo-random noise codes from spectral transformer 12 [illegible]
watermark when added to the image. The spectrally transformed data
to be watermarked, from spectral transformer 13, is also provided as
an input to a delay 15. The output of the spectral shaper 14 is then
added to the output of delay 15 at a summer 16. The summer output is
subject to an inverse transform 17. The result of the inverse
transform is watermarked data.

INSERT-MPEG-A differs from INSERT-WHOLE by segmenting the data to be
watermarked into multiple blocks, such as 8x8 pixel subimages or
subregions. Each block of data then has the watermark inserted
according to the above described method. That is, for each 8x8
subimage or subregion, a pseudo-random number (PN) sequence is
inserted into the DCT coefficients after suitable spectral shaping.
The procedure is repeated for all such subimages or subregions. The
size of the subimage or subregion is preferably 8x8, but it can be
of other sizes, such as 2x2, 3x3, 4x4 or 16x16.
FIG. 2 shows a schematic block diagram of a watermark
insertion procedure in accordance with teachings of the
present invention. The watermark signal is processed into a
noise spectrum signal by the error correction encoder 20, the
PN mapper 21, and the spectral transformer 22, in the same
manner as described in conjunction with FIG. 1. However,
unlike INSERT-WHOLE or INSERT-MPEG-A, the watermark
is then used as an input to a watermark segmenter 23,
which systematically separates the watermark into several
subwatermarks. Any portion of the original watermark
might appear redundantly in several of the resulting
subwatermarks. Concurrently, the data to be watermarked is used
as an input to data segmenter 24, which segments the data
into blocks or subregions, such as 8x8 subimages, as in
INSERT-MPEG-A. Each of the subwatermarks output by
the watermark segmenter 23 is then inserted into a data
block by one of the watermark inserters 25a, 25b, etc. The
procedure used by the watermark inserters 25a, 25b, etc., is
the same procedure described in connection with watermark
inserter 18 in FIG. 1. That is, each subwatermark is added
into a spectrally transformed data block after spectral
shaping, and the resulting data is then transformed back into
the spatial domain. Finally, the watermarked data blocks
from the watermark inserters 25a, 25b, etc., are assembled by
data combiner 26 to produce watermarked data.
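
One way a watermark segmenter of this kind might divide a watermark is sketched below. The cyclic-offset scheme, the function name `segment_watermark` and the parameters `part_len` and `shift` are hypothetical; the text above only requires that the separation be systematic and that portions of the watermark may appear redundantly in several subwatermarks.

```python
def segment_watermark(watermark, num_blocks, part_len, shift=1):
    """Produce one subwatermark per data block (illustrative scheme only).

    Block k receives part_len consecutive watermark components starting
    at a cyclically shifted offset, so each component recurs in several
    blocks. Each subwatermark is a list of (component_index, value)
    pairs, which lets a detector put the values back in place.
    """
    length = len(watermark)
    subwatermarks = []
    for k in range(num_blocks):
        start = (k * shift) % length
        indices = [(start + j) % length for j in range(part_len)]
        subwatermarks.append([(i, watermark[i]) for i in indices])
    return subwatermarks
```

Keeping the component index with each value is a design choice of this sketch; a real system would derive the index from the block position instead of storing it.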

FIG. 3 shows a schematic block diagram of the
EXTRACT-WHOLE procedure. The watermarked image,
video or multimedia data is first used as input into a spectral
normalizer 30 to undo any previously performed spectral
shaping. If the data contains a watermark, then the output of
the spectral normalizer 30 will resemble the spectral
transformation of the PN coding of that watermark (the signal
that was input to the spectral shaper 14 in FIG. 1). The
output of the spectral normalizer 30 is then used as an input
to several correlators 31a, 31b, etc., which test the watermark
with the PN codes used to represent the various
symbols that the encoded watermark might contain (i.e. each
correlator tests for one PN code that is used to encode a
symbol by the PN mapper 11 of FIG. 1). The outputs of the
correlators 31a, 31b, etc., are used as inputs to a decision
circuit 32, which determines the most likely sequence of
symbols. Finally, this sequence is corrected by an error
corrector 33, which performs the inverse coding that was
performed by the error correction encoder 10 in FIG. 1. The
result is the extracted watermark.
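
The bank of correlators and the decision circuit can be sketched as follows. This is an illustration under stated assumptions: the PN codes are taken to be random +1/-1 sequences produced by a seeded generator, the decision rule is a simple maximum over correlator outputs, and the names `make_pn_code`, `correlate` and `decide_symbol` are hypothetical.

```python
import random


def make_pn_code(seed, length):
    """Generate a +1/-1 pseudo-random noise code from an integer seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def correlate(signal, code):
    """Normalized correlation between a received signal and one PN code."""
    return sum(s * c for s, c in zip(signal, code)) / len(code)


def decide_symbol(signal, codebook):
    """Run one correlator per symbol and pick the strongest response."""
    scores = {symbol: correlate(signal, code)
              for symbol, code in codebook.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

A usage example would build `codebook = {s: make_pn_code(i, 256) for i, s in enumerate("abcd")}`, add noise to one of the codes, and pass the result to `decide_symbol`.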

In EXTRACT-MPEG-A, the data from which a watermark
is to be extracted is first segmented into several blocks,
such as 8x8 subimages, exactly as in INSERT-MPEG-A.
The signal from each subimage is then normalized and used
as input into a bank of correlators similar to the correlators
31a, 31b, etc. in FIG. 3. The output from the correlators is
then averaged with the outputs of corresponding correlators
from other subimages, and the resulting average correlations
are used as inputs into the decision circuit 32 for subsequent
processing as described above.

FIG. 4 shows a schematic block diagram of a watermark
extraction procedure in accordance with the teachings of the
present invention. The watermarked data is first segmented
into blocks by data segmenter 40, which corresponds to the
data segmenter 24 used during the insertion procedure in
FIG. 2. Each of the data blocks is provided to a respective
spectrum normalizer 41a, 41b, etc. to produce a signal
resembling the subwatermark that was inserted into the
respective data block. These inserted subwatermark signals
are then used as inputs into a watermark combiner 42. In the
combiner 42, parts of the watermark that appear redundantly
in several subwatermarks are averaged together to reduce
noise. The output of the watermark combiner 42 is provided
as the input to a symbol separator 43 which divides the
watermark into parts, each of which corresponds to one
symbol from the encoded watermark signal (the output of
error correction encoder 20 in FIG. 2).

These symbols from separator 43 are provided as inputs
to respective watermark identifiers 44a, 44b etc. each of
which includes a bank of correlators and a decision
circuit, as shown in FIG. 3. The outputs of the watermark
identifiers are symbols from the alphabet used in the original
encoded watermark signal. The identified symbols are
reassembled into a complete encoded watermark by the symbol
combiner 45. Finally, the resulting encoded watermark is
decoded by the error corrector 46.
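
The averaging performed by a watermark combiner such as combiner 42 can be sketched as below. The representation of each subwatermark as a list of (component_index, value) pairs and the name `combine_subwatermarks` are assumptions made for this illustration.

```python
def combine_subwatermarks(subwatermarks, length):
    """Average values that refer to the same watermark component.

    Each subwatermark is a list of (component_index, value) pairs as
    recovered from one data block. Components that never appear are
    returned as 0.0 in this sketch.
    """
    sums = [0.0] * length
    counts = [0] * length
    for sub in subwatermarks:
        for index, value in sub:
            sums[index] += value
            counts[index] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

For example, two blocks that both carry component 1 with recovered values 2.0 and 4.0 contribute their mean, 3.0, to the reconstructed watermark.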

The insertion and extraction procedures will now be 
described in more detail. In INSERT-ORIGINAL and 
EXTRACT-ORIGINAL, the object is to embed a single PN 
(pseudo random number) sequence into an image when the 
original image is available at the time of extraction. The 
information associated with the PN sequence is assumed to 
be stored in a database together with the original image and 
the spectral location of the embedded watermark. The locations
of the watermarked components have to be recorded
because the implementation approximated the N perceptually
most significant regions of the watermark by the N
largest coefficients. However, this ranking was not invariant
to the watermarking process. The N largest coefficients may 
be different after inserting the watermark than before insert- 
ing the watermark. 

In order to avoid this problem, the present invention
places a watermark in predetermined locations of the
spectrum, typically the first N coefficients. However, any
predetermined locations could be used, though such locations
should belong to the perceptually significant regions of
the spectrum if the watermark is to survive common signal
transformations such as compression, scaling, etc.

More generally, the information to be embedded is a 
sequence of m symbols drawn from an alphabet A (e.g. the 
binary digits or the ASCII symbols). This data is then 
supplemented with additional symbols for error detection 
and correction. Each symbol is then spread spectrum 
modulated, a process that maps each symbol into a unique 
PN sequence known as a chip. The number of bits per chip 
is preset: the longer the chip length, the higher the detected
signal-to-noise ratio will be, but this is at the expense of
signaling bandwidth.
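
The trade-off between chip length and signaling bandwidth can be made concrete with a short worked example. The model below is an assumption introduced for illustration, not part of the original disclosure: a chip $c \in \{-1,+1\}^{L}$ embedded with amplitude $\alpha$, independent zero-mean noise $n_j$ of variance $\sigma^2$, and $K$ coefficients available per frame.

```latex
% Received samples and normalized correlator output (illustrative model)
r_j = \alpha c_j + n_j, \qquad
z = \frac{1}{L}\sum_{j=1}^{L} r_j c_j
  = \alpha + \frac{1}{L}\sum_{j=1}^{L} n_j c_j

% Mean, variance and detected signal-to-noise ratio
\mathrm{E}[z] = \alpha, \qquad
\mathrm{Var}[z] = \frac{\sigma^{2}}{L}, \qquad
\mathrm{SNR} = \frac{\alpha^{2} L}{\sigma^{2}}

% Symbols that fit in one frame of K coefficients
\text{symbols per frame} = \left\lfloor \frac{K}{L} \right\rfloor
```

Under these assumptions, doubling the chip length doubles the detected signal-to-noise ratio and halves the number of symbols that fit in a frame.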

The power spectrum of the PN sequence is white, i.e. flat,
and is therefore shaped to match that of the "noise", i.e. the
image/video/audio/or multimedia data into which the watermark
is to be embedded. It is this spectral shaping that must
be modified from the prior methods so that the extraction
process no longer requires the original image. To do this,
each coefficient of the watermarked spectrum is scaled by
the local average of the power in the image spectral coefficient
rather than the coefficient itself, i.e.

$$f_i' = f_i + \alpha\,\mathrm{avg}(|f_i|)\,W_i \qquad (1)$$

The averaging is the averaging of the absolute coefficient
values and not the coefficient values themselves. This is
effectively estimating the average power present at each
frequency. Other averaging procedures are possible, for
example, averaging over several frames or average of local
neighborhoods of 8x8 blocks.

This average may be obtained in several ways. It may be 
a local average over a two dimensional region. Alternatively, 
the two dimensional spectrum may be sampled to form a one
dimensional vector and a one dimensional local average may
be performed. One dimensional vectorization of the two
dimensional 8x8 DCT coefficients is already performed as 
part of MPEG II. The average may be a simple box or 
weighted average over the neighborhood. 
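
The spectral shaping described above, with a one dimensional box average, might be sketched as follows. The window `radius`, the default `alpha` and the function names are assumptions chosen for illustration; the coefficients are taken to be already vectorized.

```python
def local_abs_average(coeffs, radius=2):
    """Box average of absolute coefficient values around each position.

    The window is truncated at the ends of the vector.
    """
    n = len(coeffs)
    averages = []
    for i in range(n):
        window = coeffs[max(0, i - radius):min(n, i + radius + 1)]
        averages.append(sum(abs(c) for c in window) / len(window))
    return averages


def shape_and_insert(coeffs, watermark, alpha=0.1, radius=2):
    """Add the watermark scaled by the local average power.

    Computes f + alpha * avg(|f|) * W for each coefficient.
    """
    averages = local_abs_average(coeffs, radius)
    return [f + alpha * a * w
            for f, a, w in zip(coeffs, averages, watermark)]
```

For instance, with coefficients `[4.0, -2.0, 6.0]`, watermark `[1.0, -1.0, 1.0]`, `alpha=0.5` and `radius=1`, the local averages are 3.0, 4.0 and 4.0, giving watermarked coefficients `[5.5, -4.0, 8.0]`.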

For video data, temporal averaging of the spectral coef- 
ficients over several frames can also be applied. However, 
since several frames are needed for averaging at the spectral
normalization stage of the extractor, the protection of indi- 
vidual video frames taken in isolation may not be possible. 
For this reason, the present invention treats video as a very 
large collection of still images. In this way, even individual 
video frames are copy protected. 

In order to extract the watermark, it is necessary to 
perform the spectral normalization, in which the previously 
performed spectral shaping procedure is inverted. In the 
present invention, the original unwatermarked signal is not 
available. Thus, the average power of the frequency
coefficients, $\mathrm{avg}(|f_i|)$, is approximated by the average of the
watermarked signal, $\mathrm{avg}(|f_i'|)$, i.e.

$$\mathrm{avg}(|f_i|) \approx \mathrm{avg}(|f_i'|) \qquad (2)$$

This is approximately true since the term $\alpha \, \mathrm{avg}(|f_i|) \, w_i$ is small, where $w_i$
is the watermark component and $\alpha$ is a constant typically
in the range between 0.1 and 0.01.

The normalization stage then divides each coefficient ($f_i'$)
in the received signal by the local average $\mathrm{avg}(|f_i'|)$ in the
neighborhood.

That is,

$$\frac{f_i'}{\mathrm{avg}(|f_i'|)} \approx \frac{f_i}{\mathrm{avg}(|f_i|)} + \alpha w_i \qquad (3)$$

The first term on the right hand side (RHS) of Equation
(3),

$$\frac{f_i}{\mathrm{avg}(|f_i|)},$$

is considered a noise term. This term was not present in the
system described in U.S. patent application Ser. No. 08/534,894,
because access to the unwatermarked coefficients
allowed this term to be removed. The second term, $\alpha w_i$, is the
original watermark signal which can now be detected using
conventional correlation.
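
The normalization of Equation (3) and the subsequent correlation can be sketched in Python as follows. This is a minimal sketch under the same assumptions as the insertion side: a one dimensional coefficient vector without the d.c. term and a three-tap box average. The helper names and the zero-average guard are illustrative choices, not part of the specification.

```python
def local_abs_average(coeffs, radius=1):
    """Box average of |coeffs| over a window of 2*radius+1 taps, clipped at the ends."""
    n = len(coeffs)
    averages = []
    for i in range(n):
        window = coeffs[max(0, i - radius):min(n, i + radius + 1)]
        averages.append(sum(abs(c) for c in window) / len(window))
    return averages


def normalize(received, radius=1):
    """Equation (3): divide each received coefficient f_i' by avg(|f_i'|)."""
    averages = local_abs_average(received, radius)
    # A zero local average carries no usable information, so emit 0.0 there.
    return [f / a if a > 0 else 0.0 for f, a in zip(received, averages)]


def correlate(extracted, pn_sequence):
    """Plain linear correlation of the extracted values with a PN sequence."""
    return sum(e * w for e, w in zip(extracted, pn_sequence)) / len(pn_sequence)
```

The output of `normalize` contains both the noise term and the watermark term of Equation (3); correlation against the known PN sequence is what separates the two.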

If the watermark is extracted from any single 8x8 block, 
the detector reliability is very low. If, however, the water- 
marks extracted from each 8x8 block are first added together 
and the averaged watermark is then applied to the correlator, 
then a very strong and unambiguous response is obtained.
This differs from the method described in U.S. patent
application Ser. No. 08/715,953 in which correlation
occurred within each block and the outputs from the correlators
were averaged together. The present invention was
found to improve the detection response and to significantly
reduce the computation requirement associated with each
block.
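
The accumulate-then-correlate order described above can be illustrated with synthetic data. In this sketch each block's normalized output is modeled as unit-variance Gaussian noise plus alpha times the PN sequence; the noise model, the seed, the sequence length and the block count are assumptions chosen only to make the example self-contained.

```python
import random


def average_blocks(extracted_blocks):
    """Average the per-block extracted sequences coefficient by coefficient."""
    length = len(extracted_blocks[0])
    totals = [0.0] * length
    for block in extracted_blocks:
        for i, value in enumerate(block):
            totals[i] += value
    return [t / len(extracted_blocks) for t in totals]


def correlate(extracted, pn_sequence):
    return sum(e * w for e, w in zip(extracted, pn_sequence)) / len(pn_sequence)


rng = random.Random(0)
alpha = 0.1
pn = [rng.choice((-1, 1)) for _ in range(16)]
# Hypothetical stand-in for the normalized output of 4096 blocks.
blocks = [[rng.gauss(0.0, 1.0) + alpha * w for w in pn] for _ in range(4096)]

single_block_response = correlate(blocks[0], pn)
combined_response = correlate(average_blocks(blocks), pn)
```

In this model the combined response settles close to alpha, while the response from any single block is dominated by the noise term. Only one correlation is performed per image rather than one per block.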

In practicing the present invention, there is preferably a
unique PN sequence for each symbol in the alphabet. The
method is relatively robust to clipping since the detector
output reduces linearly with the quantity of 8x8 subimage
blocks in the image. For DVD (digital video disk) embedded
signaling for APS (analog protection system) and CGMS
(copy generation management system), there would be a
total of 8 or 16 PN sequences.
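
With one PN sequence per symbol, the decision step amounts to running one correlator per symbol and keeping the largest output. The sketch below assumes a small dictionary that maps hypothetical symbol names to PN sequences; the names and values are illustrative only.

```python
def correlate(extracted, pn_sequence):
    return sum(e * w for e, w in zip(extracted, pn_sequence)) / len(pn_sequence)


def decide_symbol(extracted, pn_table):
    """Return the symbol whose PN sequence correlates best with the extracted watermark."""
    return max(pn_table, key=lambda symbol: correlate(extracted, pn_table[symbol]))


# Hypothetical two-symbol alphabet.
pn_table = {"A": [1, -1, 1, -1], "B": [1, 1, -1, -1]}
extracted = [0.09, -0.11, 0.10, -0.08]
symbol = decide_symbol(extracted, pn_table)
```

For an alphabet of 8 or 16 symbols the table would simply hold 8 or 16 sequences, at the cost of the same number of correlations.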

The number of 8x8 blocks in a 512x512 image is 4096,
suggesting that significantly more than one of 16 symbols
can be embedded in an image or video frame. Assume, for
example, that it is desired to embed 1 out of 128 symbols in
an image. It is necessary to perform 128 parallel correlations.
This is computationally tractable but hardware implementations
of each correlation become more complex. An
alternative method is to only use two binary symbols. It may
be preferable to associate more than one PN sequence with
each of the two binary symbols or bits in order to increase
the difficulty of intentionally removing the watermark. In
this case, there are only two correlators and a binary string
may be embedded into the image. The raw bit error rate will
be very high due to the low detector output. However, this
can be reduced to acceptable levels by using error correcting
codes, such as Reed-Solomon (RS). RS codes are robust to
burst errors which may occur because of clipping of the
image. Other error correcting codes may also be used.

When using this method, it is necessary for the receiver to
know the start location of the encoded block. The start
location may not be obvious, particularly when the image
has been subjected to clipping. However, conventional synchronizing
methods can be used, such as preceding each
block with a special or unique symbol or string of symbols.

To insert a watermark, each 8x8 block is treated as an
individual subimage or subregion. The DCT of the subimage
is then computed and the two dimensional DCT is vectorized
in the zigzag pattern shown in FIG. 5, although other
patterns are also possible. These two stages constitute most
of the calculations but are part of the MPEG encoding
process. Next, a PN noise sequence {w_1 . . . w_n} is inserted
into the DCT coefficients using Equation 1 as before. The
length of the PN sequence cannot exceed 64 (in an 8x8
block) and is typically much shorter, in the range of 11 to 25.
If only a single code is to be inserted into the image,
then the same PN sequence is inserted into each of the
720 x 480/64 = 5400 blocks. However, a variation may be
performed at this point in the procedure. Within each row of
blocks, the PN sequence is cyclically rotated by one frequency
coefficient prior to insertion in the subsequent block.
Similarly, the PN sequence is cyclically rotated by one
frequency coefficient at the start of each new row. FIG. 6
illustrates an order of rotations.
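
The block count and the rotation schedule can be sketched as follows. FIG. 6 is not reproduced here, so the schedule is an assumed reading of the text: the sequence advances by one position per block within a row, and each new row starts one position further than the previous row, which gives a shift of (row + col) modulo the sequence length. The rotation direction is also an arbitrary choice of this sketch.

```python
def rotate(seq, k):
    """Cyclically rotate a list to the right by k positions."""
    k %= len(seq)
    return seq[-k:] + seq[:-k]


def block_shift(row, col, length):
    """Assumed schedule: one extra position per block in a row and per new row."""
    return (row + col) % length


def sequence_for_block(pn, row, col):
    """PN sequence as it would be inserted into the block at (row, col)."""
    return rotate(pn, block_shift(row, col, len(pn)))


# A 720 x 480 frame divided into 8x8 blocks.
blocks_per_frame = (720 // 8) * (480 // 8)
```

Under this reading, clipping the image on a block boundary changes every remaining block's shift by the same constant, which is the property the detector relies on later.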

The purpose of these rotations or shifts is to improve the
response of the watermark extraction stage. Earlier experiments
revealed that certain DCT coefficients were more
difficult to estimate than others. The location of these coefficients
varied from image to image. However, within an
image, the coefficient could be consistently poor.
Consequently, without shifting, one or more of the estimated
watermark coefficients could be significantly degraded relative
to the other watermark coefficients, thereby reducing the
detector performance. Conversely, shifting significantly
reduces the effect a poor DCT coefficient has on a single
watermark coefficient and the detector performance is markedly
improved. Note that any cyclic pattern can be used.

Further modifications are useful once rotation of the
watermark has been introduced. First, the length of the
watermark may now be significantly greater than 64. Then,
for each block only a small subset of the watermark (say
five) coefficients is inserted into the first five DCT coefficients
(excluding the d.c. term). Because of the rotation, a
different subset of the watermark is inserted into neighboring
8x8 blocks. Finally, having completed the watermark
insertion, the MPEG encoder is able to proceed with the
subsequent stages of compression.
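
One way to picture the subset insertion is a sliding window over a long watermark. The sketch below assumes the window advances by one position per block and wraps around at the end of the watermark; the indexing scheme, the function names and the default of five coefficients per block are illustrative assumptions rather than details from the specification.

```python
def subset_for_block(watermark, block_index, count=5):
    """Select `count` consecutive watermark values, starting at block_index, with wrap-around."""
    n = len(watermark)
    return [watermark[(block_index + j) % n] for j in range(count)]


def insert_subset(ac_coeffs, local_avg, watermark, block_index, alpha=0.1, count=5):
    """Apply Equation (1) to the first `count` a.c. coefficients of one block."""
    marked = list(ac_coeffs)
    for j, w in enumerate(subset_for_block(watermark, block_index, count)):
        marked[j] = ac_coeffs[j] + alpha * local_avg[j] * w
    return marked
```

Since neighboring blocks carry overlapping subsets, each watermark coefficient is still observed in several blocks, which is what allows the detector to average.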

Note that the watermark may also be inserted after the
MPEG quantization stage to reduce distortion of the watermark.
MPEG-2 performs a convenient one dimensional vectorization
called "zigzagging", which allows a simple 3x1
box average to be performed on the coefficients (excluding
the d.c. term).
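
The zigzag vectorization and the 3x1 box average can be sketched as below. FIG. 5 is not reproduced here, so the sketch assumes the conventional JPEG/MPEG zigzag order, in which anti-diagonals are traversed in alternating directions starting from the d.c. term. The function names are illustrative.

```python
def zigzag_order(size=8):
    """(row, col) pairs of a size x size block in conventional zigzag order."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    # Sort by anti-diagonal; odd diagonals run top to bottom, even ones bottom to top.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def zigzag_vectorize(block):
    """Flatten a square block of coefficients into a one dimensional list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def box_average_3x1(vector):
    """3x1 box average of absolute values over the a.c. terms; vector[0] (d.c.) is skipped."""
    ac = [abs(v) for v in vector[1:]]
    averages = []
    for i in range(len(ac)):
        window = ac[max(0, i - 1):i + 2]
        averages.append(sum(window) / len(window))
    return averages
```

An encoder that already produces the zigzag vector can reuse it directly, so the only additional work is the short moving average.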

In practice, performance was improved if the averaging was
performed using the 2 four-connected coefficients closest to
the d.c. term, as illustrated in FIG. 7, i.e. the two coefficients
above and to the left.
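
A possible reading of this neighbor-based average is sketched below for a two dimensional block of DCT coefficients. FIG. 7 is not reproduced here, so the handling of the first row and column is an assumption: neighbors outside the block and the d.c. term itself are skipped, and a coefficient with no usable neighbor falls back to its own magnitude.

```python
def neighbor_average(block, r, c):
    """Average |coefficient| of the neighbors above and to the left of (r, c)."""
    candidates = [(r - 1, c), (r, c - 1)]
    values = [abs(block[i][j]) for i, j in candidates
              if i >= 0 and j >= 0 and (i, j) != (0, 0)]
    if not values:
        # Assumed fallback for coefficients adjacent only to the d.c. term.
        return abs(block[r][c])
    return sum(values) / len(values)
```

Both neighbors lie closer to the d.c. term than the coefficient being scaled, so they tend to be the better preserved of the four-connected neighbors after quantization.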

Watermark detection begins by first extracting the PN
noise sequence from each 8x8 block using Equation 1. For
each block, the PN sequence is then cyclically shifted in the
opposite direction by one frequency coefficient, and the
average over all the blocks is then computed. In practice, this
process can be computed incrementally and does not require
temporary storage of all the extracted watermarks. A
weighted averaging can also be applied, where the weights
are determined based on their susceptibility to common
signal transformations such as low pass filtering. Finally, the
average watermark is compared with the original PN
sequence via correlation.

The reason for shifting the watermark in the column direction
may now be apparent. If the image is clipped on an arbitrary
block boundary, then the computed average watermark will
simply be rotated by an amount that is a function of the
relative location of the clipped portion of the image.
Correlation can then be performed on all permutations
(typically 11 to 25) of the watermark. The output from the
correlator with the maximum value is then used for decision
purposes. The extraction stage is depicted in FIG. 4.

Taking the maximum correlator output over all rotations of
the watermark can cause the decision circuitry to be noisy.
To improve this, the watermark is broken into two pieces:
a synchronization portion of length K and a verification
portion of length N-K. Then, when the watermark is extracted
as before, correlation is first performed only on all
rotations of the synchronization portion of this watermark.
The maximum correlation output is noted, then the
verification portion of the watermark is rotated by the
corresponding amount and a second correlation is performed
on the verification portions of the watermarks. This process
significantly improves the overall reliability of the system.

In the course of experimentation, it was noticed that some
watermarks performed better than others on the same
imagery. This was caused by variation in the correlation
statistics between the synchronization and verification
portions of the watermark. Ideally, the two portions should
have very low correlations. However, in several cases where
watermarks performed poorly, it was traced to unexpected
correlations between the two portions.
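
The two-stage decision can be sketched as follows. The layout is an assumption of this example: the watermark is taken to be the synchronization portion followed by the verification portion, and the averaged watermark arrives cyclically rotated by an unknown amount. The search tries every rotation against the synchronization reference only, then checks the verification portion at the winning alignment. The sequences used are arbitrary illustrative values.

```python
def rotate(seq, k):
    """Cyclically rotate a list to the right by k positions (negative k rotates left)."""
    k %= len(seq)
    return seq[-k:] + seq[:-k]


def correlate(values, reference):
    return sum(a * b for a, b in zip(values, reference)) / len(reference)


def detect(extracted, sync_ref, verify_ref):
    """Return (estimated rotation, verification correlation at that rotation)."""
    k = len(sync_ref)
    best_shift = max(range(len(extracted)),
                     key=lambda s: correlate(rotate(extracted, -s)[:k], sync_ref))
    aligned = rotate(extracted, -best_shift)
    return best_shift, correlate(aligned[k:], verify_ref)


sync = [1, 1, 1, -1, -1, 1, -1]      # K = 7
verify = [1, -1, -1, 1, 1, -1]       # N - K = 6
received = rotate(sync + verify, 3)  # e.g. the effect of clipping
shift, score = detect(received, sync, verify)
```

Because the maximum is taken only over the synchronization portion, the verification correlation is a single measurement rather than the largest of many, which is one way to understand the reported improvement in reliability.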

The present invention provides a modification to digital
watermarking methods in which the original data is required
for watermark extraction, thereby enabling watermark
extraction in the absence of unwatermarked or original
data. The present invention preferably uses MPEG/JPEG
coefficients. An image is divided into subimages or
subregions, typically 8x8 blocks, each subimage is processed,
and the results are combined to derive the extracted watermark.
The result is extraction of the watermark with very
high confidence.

While the above invention describes improvements to the 
prior-art INSERT-WHOLE, INSERT-MPEG-A, 
EXTRACT-WHOLE, and EXTRACT-MPEG-A algorithms, 
it should be apparent to anyone skilled in the art that the 
same improvements may be applied to any algorithm for 
inserting and extracting watermarks in image data. This 
more general view of the present invention is shown in
FIGS. 8 and 9.

FIG. 8 shows a schematic block diagram of the general
method for inserting watermarks. This general method 
makes use of a non-block-based watermark insertion 
algorithm, which shall be referred to hereafter as the "base 
insertion algorithm". The watermark encoder 80 converts 
the watermark into a form appropriate for the base insertion 
algorithm. If the base insertion algorithm is that shown in
FIG. 1, for example, then the watermark encoder 80 corresponds
to the watermark encoder 19, which comprises the
error correction encoder 10, the PN mapper 11, and the 
spectral transformer 12. However, if a different base inser- 
tion algorithm is to be used, then the watermark encoder 80 
may perform a different transformation of the watermark. 
The encoded watermark signal from watermark encoder 80
is provided as an input to watermark segmenter 81, which
divides the watermark into a set of subwatermarks. Any
portion of the original watermark might appear redundantly
in several of the resulting subwatermarks. The data to be
watermarked is provided as an input to data segmenter 82,
which divides the data into subregions. Each subwatermark
is inserted into a respective data subregion by a watermark
inserter 83a, 83b, etc. The watermark inserters implement
the base insertion algorithm, so, if the base insertion algorithm
is that shown in FIG. 1, then each watermark inserter
83a, 83b, etc., corresponds to the watermark inserter 18,
which comprises a spectral transformer 13, a spectral shaper
14, a delay 15, a summer 16, and an inverse transform 17.
However, if a different base insertion algorithm is to be used,
then the watermark inserters 83a, 83b, etc., may employ a
different method of inserting subwatermarks into the subre- 
gions of the data to be watermarked. The outputs from the 
watermark inserters are assembled in data combiner 84 to 
provide watermarked data. 
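
The general structure of FIG. 8 can be sketched with the base insertion algorithm passed in as a function. In this sketch the data segmenter cuts a two dimensional list into square blocks, subwatermarks are reused cyclically across blocks, and the data combiner reassembles the result. All names, the cyclic assignment and the assumption that the image dimensions are multiples of the block size are choices made for illustration.

```python
def segment(image, size=8):
    """Data segmenter: yield (row, col, block) for each size x size subregion."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            yield r, c, [line[c:c + size] for line in image[r:r + size]]


def combine(blocks, rows, cols):
    """Data combiner: reassemble (row, col, block) triples into one image."""
    out = [[0] * cols for _ in range(rows)]
    for r, c, block in blocks:
        for i, line in enumerate(block):
            out[r + i][c:c + len(line)] = line
    return out


def watermark_data(image, subwatermarks, insert, size=8):
    """Apply the base insertion function `insert(block, subwatermark)` to every subregion."""
    rows, cols = len(image), len(image[0])
    marked = []
    for index, (r, c, block) in enumerate(segment(image, size)):
        sub = subwatermarks[index % len(subwatermarks)]
        marked.append((r, c, insert(block, sub)))
    return combine(marked, rows, cols)
```

Any inserter with the same signature can be substituted, which mirrors the point that the block-based scheme does not depend on a particular base insertion algorithm.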

FIG. 9 shows a schematic block diagram of the corre- 
sponding general extraction algorithm. The algorithm makes 
use of a "base extraction" algorithm that corresponds to the 
base insertion algorithm used in inserting the watermark into 
the data to be watermarked (FIG. 8). The algorithm in FIG. 
9 is substantially the same as the algorithm shown in FIG. 
4, except that, in the general case, the spectrum normalizers 
41a, etc. are replaced by watermark extractors 91a, etc., 
which implement the base extraction algorithm. That is, if
the base insertion algorithm used was the algorithm shown
in FIG. 1, then the watermark extractors 91a, etc., in FIG. 9
will be the spectrum normalizers 41a, etc. in FIG. 4.

While there has been described and illustrated a system 
for inserting a watermark into and extracting a watermark 
from watermarked data without using an unwatermarked 
version of the data, it will be apparent to those skilled in the 
art that variations and modifications are possible without 
deviating from the broad principles and teachings of the 
present invention which shall be limited solely by the scope 
of the claims appended hereto. 

What is claimed is:

1. A method for inserting a watermark signal into data to
be watermarked comprising the steps of:

dividing data to be watermarked into a plurality of subregions;

computing frequency coefficients of the data to be watermarked
in each subregion;

spread spectrum modulating a watermark signal to be
inserted by mapping the watermark signal into a PN
(pseudo-random noise) sequence;

spectral shaping the PN sequence as a function of the
average power in each frequency coefficient of the data;
and

inserting each spectral shaped PN sequence into predetermined
coefficients in the data in each subregion.

2. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

3. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where said frequency 
coefficients are DCT (discrete cosine transform) coefficients.

4. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 3, where each subregion
is an 8x8 block of pixels.

5. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 4, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

6. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where each subregion
is an 8x8 block of pixels.

7. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 6, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

8. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 6, where the frequency 
coefficients of the watermark signal are rotated prior to 
inserting of each spectral shaped PN sequence into the 
subregion. 

9. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 8, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

10. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 8, where only a subset 
of the watermark signal frequency coefficients is inserted 
into any one subregion. 

11. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 10, where the water- 
mark signal comprises a synchronization portion and a 
verification portion. 

12. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 11, where said inserting 
is performed after the data undergoes MPEG quantization 
processing. 

13. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 11, where the synchronization
portion and the verification portion have very little
correlation between each other. 

14. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where the spectral
shaping as a function of the average power is typically a 3x1
window of the coefficients obtained from the one-dimensional
vectorization by zigzagging of two-dimensional
frequency coefficients.

15. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 1, where the spectral
shaping is a function of the average power based on the two
four-connected frequency coefficients closest to the DC
term.

16. A method of extracting a watermark from watermarked
data comprising the steps of:

receiving subregions of watermarked data;

spectrum normalizing the watermarked data as a function
of the average power in each frequency coefficient of
the watermarked data in each subregion to generate
respective normalized signals;

combining the respective normalized signals from each
subregion to generate a single watermark;

correlating the single watermark with predetermined PN
(pseudo-random noise) sequences corresponding to
predetermined symbols to provide correlated signals
for each predetermined PN sequence in each subregion;

deciding which correlated signal is most likely a current
symbol; and

extracting a sequence of most likely current symbols
corresponding to the watermark.

17. A method of extracting a watermark from water- 
marked data as set forth in claim 16, where the subregions 
are 8x8 blocks used for MPEG encoding and decoding. 

18. A method of extracting a watermark from water- 
marked data as set forth in claim 17, where said combining 
the normalized signals from each subregion to generate a 
single watermark, including removing the relative rotation 
of the watermark between blocks.

19. A method of extracting a watermark from water- 
marked data as set forth in claim 18, further comprising 
subsequently reconstructing the watermark from partial
watermarks inserted into each block. 

20. A method of extracting a watermark from water- 
marked data as set forth in claim 19, further comprising 
weighting the watermark coefficients based on their location 
within the frequency spectrum, where the weighting is a 
function of the susceptibility of each frequency coefficient to 
common signal transformations. 

21. A method of extracting a watermark from water- 
marked data as set forth in claim 16, further comprising 
correlating with all rotational shifts of the extracted water- 
mark and selecting the maximum value. 

22. A method of extracting a watermark from watermarked
data as set forth in claim 16, further comprising
correlating with all rotational shifts of a synchronization 
portion of a watermark to determine a maximum value and 
subsequently rotating a verification portion of the watermark 
by the same amount as the synchronization portion is rotated 
to obtain the maximum value prior to correlating between 
the verification portion and predetermined PN sequences. 

23. A method of extracting a watermark from water- 
marked data comprising the steps of: 

receiving subregions of watermarked data;

spectrum normalizing the watermarked data as a function
of the average power in each frequency coefficient of
the watermarked data in each subregion to generate
respective normalized signals;

correlating the respective normalized signals with predetermined
PN sequences corresponding to predetermined
symbols to provide correlated signals for each
predetermined PN sequence in each subregion;

deciding which correlated signal is most likely a current
symbol in each subregion for providing an extracted
symbol stream;

error correcting the extracted symbol stream; and

extracting a sequence of most likely current symbols
corresponding to the watermark.

24. A method of extracting a watermark from watermarked
data as set forth in claim 23, where said error
correction is Reed Solomon error correction.






5,915,027 






25. A method for inserting a watermark signal into data to 
be watermarked comprising the steps of: 

dividing data to be watermarked into a plurality of sub- 
regions; 

dividing a watermark signal into a plurality of subwater- 
marks where portions of the watermark are contained in 
more than one subwatermark; and 

inserting said plurality of subwatermarks into said plu- 
rality of subregions. 

26. A method for inserting a watermark signal into data to 
be watermarked as set forth in claim 25, where each subwatermark
is inserted into a respective subregion, so that
each subregion contains at least one subwatermark.

27. A method for extracting a watermark signal from 
watermarked data comprising the steps of: 

receiving a plurality of subregions of watermark data;

extracting a subwatermark from each subregion of said
plurality of subregions; and

combining and averaging the subwatermarks extracted
from all the subregions to obtain a signal commensurate
with the watermark signal.

28. A method for extracting a watermark signal from
watermarked data as set forth in claim 27, further comprising
the steps of:

dividing the signal commensurate with the watermark 
signal into a plurality of symbol signals; 

correlating each symbol signal with a set of predefined 
signals; 

determining which predefined signal best corresponds to 
each symbol signal; and 

combining the best corresponding predetermined signals
to generate the watermark signal.

SECURE SPREAD SPECTRUM
WATERMARKING FOR MULTIMEDIA DATA

This application is a continuation of application Ser. No. 
08/534,894, filed Sep. 28, 1995, now abandoned. 

FIELD OF THE INVENTION 

The present invention concerns a method of digital watermarking
for use in audio, image, video and multimedia data
for the purpose of authenticating copyright ownership, identifying
copyright infringers or transmitting a hidden message.
Specifically, a watermark is inserted into the perceptually
most significant components of a decomposition of
the data in a manner so as to be virtually imperceptible. 
More specifically, a narrow band signal representing the 
watermark is placed in a wideband channel that is the data. 

BACKGROUND OF THE INVENTION 

The proliferation of digitized media such as audio, image 
and video is creating a need for a security system which 
facilitates the identification of the source of the material. The
need manifests itself in terms of copyright enforcement and
identification of the source of the material. 

Using conventional cryptographic systems permits only 
valid keyholder access to encrypted data, but once the data 
is encrypted, it is not possible to maintain records of its 
subsequent representation or transmission. Conventional 
cryptography therefore provides minimal protection against 
data piracy of the type a publisher or owner of data or 
material is confronted with by unauthorized reproduction or 
distribution of such data or material. 

A digital watermark is intended to complement crypto- 
graphic processes. The watermark is a visible or preferably 
an invisible identification code that is permanently embedded
in the data. That is, the watermark remains with the data
after any decryption process. As used herein the terms data 
and material will be understood to refer to audio (speech and 
music), images (photographs and graphics), video (movies 
or sequences of images) and multimedia data (combinations 
of the above categories of materials) or processed or com- 
pressed versions thereof. These terms are not intended to 
refer to ASCII representations of text, but do refer to text 
represented as an image. A simple example of a watermark 
is a visible "seal" placed over an image to identify the
copyright owner. However, the watermark might also contain
additional information, including the identity of the
purchaser of the particular copy of the image. An effective 
watermark should possess the following properties: 

1. The watermark should be perceptually invisible or its 
presence should not interfere with the material being pro- 
tected. 

2. The watermark must be difficult (preferably virtually 
impossible) to remove from the material without rendering 
the material useless for its intended purpose. However, if 
only partial knowledge is known, e.g. the exact location of 
the watermark within an image is unknown, then attempts to 
remove or destroy the watermark, for instance by adding 
noise, should result in severe degradation in data fidelity, 
rendering the data useless, before the watermark is removed 
or lost. 

3. The watermark should be robust against collusion by
multiple individuals who each possess a watermarked copy
of the data. That is, the watermark should be robust to the
combining of copies of the same data set to destroy the
watermarks. Also, it must not be possible for colluders to
combine each of their images to generate a different valid
watermark.

4. The watermark should still be retrievable if common
signal processing operations are applied to the data. These
operations include, but are not limited to, digital-to-analog
and analog-to-digital conversion, resampling, requantization
(including dithering and recompression) and common signal
enhancements to image contrast and color, or audio bass and
treble for example. The watermarks in image and video data
should be immune from geometric image operations such as
rotation, translation, cropping and scaling.

5. The same digital watermark method or algorithm 
should be applicable to each of the different media under 
consideration. This is particularly useful in watermarking of
multimedia material. Moreover, this feature is conducive to
the implementation of audio and image/video watermarking
using common hardware. 

6. Retrieval of the watermark should unambiguously
identify the owner. Moreover, the accuracy of the owner
identification should degrade gracefully during attack.

Several previous digital watermarking methods have been proposed.
L. F. Turner in patent number WO89/08915 entitled
"Digital Data Security System" proposed a method for
inserting an identification string into a digital audio signal by
substituting the "insignificant" bits of randomly selected
audio samples with the bits of an identification code. Bits are
deemed "insignificant" if their alteration is inaudible. Such
a system is also appropriate for two dimensional data such
as images, as discussed in an article by R. G. Van Schyndel
et al entitled "A digital watermark" in Intl. Conf. on Image
Processing, vol 2, Pages 86-90, 1994. The Turner method
may easily be circumvented. For example, if it is known that
the algorithm only affects the least significant two bits of a
word, then it is possible to randomly flip all such bits,
thereby destroying any existing identification code.

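
The weakness of least significant bit substitution can be seen in a few lines of Python. This is a simplified sketch of the general idea rather than of Turner's actual system: one code bit replaces the lowest bit of each selected integer sample, and the attack overwrites the two lowest bits of every sample with random values. The sample values, positions and seeds are arbitrary assumptions.

```python
import random


def embed_lsb(samples, bits, positions):
    """Replace the least significant bit of each selected sample with a code bit."""
    marked = list(samples)
    for bit, pos in zip(bits, positions):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked


def extract_lsb(samples, positions):
    """Read the code back from the least significant bits of the selected samples."""
    return [samples[pos] & 1 for pos in positions]


def randomize_low_bits(samples, nbits=2, seed=0):
    """Attack: overwrite the nbits lowest bits of every sample with random values."""
    rng = random.Random(seed)
    mask = (1 << nbits) - 1
    return [(s & ~mask) | rng.getrandbits(nbits) for s in samples]


samples = list(range(100, 132))
positions = random.Random(1).sample(range(len(samples)), 8)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(samples, bits, positions)
attacked = randomize_low_bits(marked)
```

The attacker does not need to know which samples carry the code: every sample changes by at most 3, yet the bits read back from `attacked` are no longer tied to the embedded code.
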
An article entitled "Assuring Ownership Rights for Digital
Images" by G. Caronni, in Proc. Reliable IT Systems,
VIS '95, 1995 suggests adding tags (small geometric
patterns) to digitized images at brightness levels that are
imperceptible. While the idea of hiding a spatial watermark
in an image is fundamentally sound, this scheme is susceptible
to attack by filtering and redigitization. The fainter such
watermarks are, the more susceptible they are to such attacks,
and geometric shapes provide only a limited alphabet with
which to encode information. Moreover, the scheme is not
applicable to audio data and may not be robust to common
geometric distortions, especially cropping.

J. Brassil et al in
an article entitled "Electronic Marking and Identification
Techniques to Discourage Document Copying" in Proc. of
Infocom 94, pp 1278-1287, 1994 propose three methods
appropriate for document images in which text is common.
Digital watermarks are coded by: (1) vertically shifting text
lines, (2) horizontally shifting words, or (3) altering text
features such as the vertical endlines of individual characters.
Unfortunately, all three proposals are easily defeated, as
discussed by the authors. Moreover, these techniques are
restricted exclusively to images containing text.

An article by K. Tanaka et al entitled "Embedding Secret
Information into a Dithered Multi-level Image" in IEEE
Military Comm. Conf., pp 216-220, 1990 and K. Mitsui et al
in an article entitled "Video-Steganography" in IMA Intellectual
Property Proc., v1, pp 187-206, 1994, describe several
watermarking schemes that rely on embedding watermarks
that resemble quantization noise. Their ideas hinge on
the notion that quantization noise is typically imperceptible
to viewers. Their first scheme injects a watermark into an
image by using a predetermined data stream to guide level
selection in a predictive quantizer. The data stream is chosen
so that the resulting watermark looks like quantization noise.
A variation of this scheme is also presented, where a
watermark in the form of a dithering matrix is used to dither
an image in a certain way. There are several drawbacks to
these schemes. The most important is that they are susceptible
to signal processing, especially requantization, and
geometric attacks such as cropping. Furthermore, they
degrade an image in the same way that predictive coding and
dithering can.

In Tanaka et al, the authors also propose a scheme for
watermarking facsimile data. This scheme shortens or
lengthens certain runs of data in the run length code used to
generate the coded fax image. This proposal is susceptible to
digital-to-analog and analog-to-digital conversions. In
particular, randomizing the least significant bit (LSB) of
each pixel's intensity will completely alter the resulting run
length encoding. Tanaka et al also propose a watermarking
method for "color-scaled picture and video sequences". This
method applies the same signal transform as JPEG (DCT of
8x8 sub-blocks of an image) and embeds a watermark in the
coefficient quantization module. While being compatible
with existing transform coders, this scheme is quite susceptible
to requantization and filtering and is equivalent to
coding the watermark in the least significant bits of the
transform coefficients.

In a recent paper by Macq and Quisquater entitled
"Cryptology for Digital TV Broadcasting" in Proc. of the
IEEE, 83(6), pp 944-957, 1995 there is briefly discussed the
issue of watermarking digital images as part of a general
survey on cryptography and digital television. The authors
provide a description of a procedure to insert a watermark
into the least significant bits of pixels located in the vicinity
of image contours. Since it relies on modifications of the
least significant bits, the watermark is easily destroyed.
Further, the method is only applicable to images in that it
seeks to insert the watermark into image regions that lie on
the edge of contours.

W. Bender et al in an article entitled "Techniques for Data
Hiding" in Proc. of SPIE, v2420, page 40, July 1995,
describe two watermarking schemes. The first is a statistical
method called "Patchwork". Patchwork randomly chooses n
pairs of image points (a_i, b_i) and increases the brightness at
a_i by one unit while correspondingly decreasing the brightness
of b_i. The expected value of the sum of the differences
of the n pairs of points is claimed to be 2n, provided certain
statistical properties of the image are true. In particular, it is

. . . }, {1,3,5}, for example) which are selected based on the
binary digit to be transmitted. Thus Adelson's method is
equivalent to watermark schemes that encode information
into least significant bits of the data or its transform
coefficients. Adelson recognizes that the method is susceptible
to noise and therefore proposes an alternative scheme
wherein a 2x1 Hadamard transform of the sampled analog
signal is taken. The differential coefficient of the Hadamard
transform is offset by 0 or 1 unit prior to computing the
inverse transform. This corresponds to encoding the watermark
into the least significant bit of the differential coefficient
of the Hadamard transform. It is not clear that this
approach would demonstrate enhanced resilience to noise.
Furthermore, like all such least significant bit schemes, an
attacker can eliminate the watermark by randomization.

. . . signal. This is accomplished by analyzing
the frequency spectrum of the EDTV signal (larger than
that of the NTSC signal) and decomposing it into three
subbands . . . respectively). In contrast, the NTSC signal is
decomposed into two subbands, L and M. The coefficients, M_k,
within the . . . coefficients, H_k, of the EDTV signal are scaled
such that the addition of the H_k signal plus any noise present
in the system is less than the minimum separation between
quantization levels. Once more, the method relies on modifying
least significant bits. Presumably, the mid-range rather than low
frequencies were chosen because they are less perceptually
significant. In contrast, the method proposed in the present
invention modifies the most perceptually significant components
of the signal.

Finally, it should be noted that many, if not all, of the prior
protocols are not collusion resistant.

. . . Corporation of Portland, Oreg., has
described work referred to as signature technology for use in
identifying digital intellectual property. Their method adds
or subtracts small random quantities from each pixel.
Addition or subtraction is based on comparing a binary mask
of bits with the least significant bit (LSB) of each pixel.
If the LSB is equal to the corresponding mask bit, then the
random quantity is added, otherwise it is subtracted. The
watermark is extracted by first computing the difference
between the original and watermarked images and then by
examining the sign of the difference, pixel by pixel, to

assumed that all brightness levels arc equally likely, that is, determine if it corresponds to the original sequence of 

intensities are uniformly distributed. However, in practice, additions/subtractions. The Digimarc technique is not based 

this is very uncommon. Moreover, the scheme may not be ^^^^ modifications of the image spectrum and does not 

robust to randomly jittering the intensity levels by a single jQ^ke use of perceptual relevance. While the technique 

unit, and be extremely sensitive to geometric afiBne trans- appears to be robust, it may be susceptible to constant 

formations. brighmess ofifeets and to altadcs based on exploiting the high 

The second method is called "texture block coding", degree of local correlation present in an imagp. For example, 
where a region of random texture p^tt.em.foupd, in the image randomly switching the position of similar pixels withia-a 
is copied to an area of the image with similar texture. 55 local neighborhood may significantly degrade the water- 
Autocorrelation is then used to recover each texture region. mark without damaging the image. 
The most significant problem with this technique is that it is jn a paper by Koch, Rindfrey and Zhao entided "Copy- 
only appropriate for images that possess large areas of right Protection for Multimedia Data", two general mcdiods 
random texture. The technique could not be used on images for watermarking images are described. The first method 
of text, for example. Nor is there a direct analog for audio, gg partitions an image into 8x8 blocks of pixels and computes 

In addition to direct work on watermarking images, there the Discrete Cosine Transform (OCT) of each of these 

are several works of interest in related areas. E. H. Adelson blocks. A pseudorandom subset of the blocks is chosen and 

in U.S. Pa. No. 4,939,515 entided "Digital Signal Encoding in each such block a triple of fi-equencies selected from one 

and Decoding >!^»paratus" describes a technique for embed- of 18 predetermined triples is modified so diat their relative 
ding digital information in an analog signal for the piupose 65 strengths encode a 1 or 0 value. The 18 possible triples are 

of itBcrting digital data into an analog TV signal. The analog composed by selection of three out of eight predetermined 

signal is quantized into one of two disjoint ranges ({0,2,4 . frequencies within the 8x8 DCT block. The choice of the 
eight frequencies to be altered within the DCT block appears to be based on the belief that middle frequencies have a moderate variance level, i.e., they have similar magnitude. This property is needed in order to allow the relative strength of the frequency triples to be altered without requiring a modification that would be perceptually noticeable. Unlike in the present invention, the set of frequencies is not chosen based on any perceptual significance or relative energy considerations. In addition, because the variance between the eight frequency coefficients is small, one would expect that the technique may be sensitive to noise or distortions. This is supported by the experimental results reported in the Koch et al paper, supra, where it is reported that the "embedded labels are robust against JPEG compression for a quality factor as low as about 50%". In contrast, the method described in accordance with the teachings of the present invention has been demonstrated with compression quality factors as low as 5 percent.

An earlier proposal by Koch and Zhao in a paper entitled "Toward Robust and Hidden Image Copyright Labeling" proposed not triples of frequencies but pairs of frequencies and was again designed specifically for robustness to JPEG compression. Nevertheless, the report states that "a lower quality factor will increase the likelihood that the changes necessary to superimpose the embedded code on the signal will be noticeably visible".

In a second method, proposed by Koch and Zhao, designed for black and white images, no frequency transform is employed. Instead, the selected blocks are modified so that the relative frequency of white and black pixels encodes the final value. Both watermarking procedures are particularly vulnerable to multiple document attacks. To protect against this, Zhao and Koch proposed a distributed 8x8 block of pixels created by randomly sampling 64 pixels from the image. However, the resulting DCT has no relationship to that of the true image. Consequently, one would expect such distributed blocks to be both sensitive to noise and likely to cause noticeable artifacts in the image.

In summary, prior art digital watermarking techniques are not robust and the watermark is easy to remove. In addition, many prior techniques would not survive common signal and geometric distortions.

SUMMARY OF THE INVENTION

The present invention overcomes the limitations of the prior art methods by providing a watermarking system that embeds a unique identifier into the perceptually significant components of a decomposition of an image, an audio signal or a video sequence.

Preferably, the decomposition is a spectral frequency decomposition. The watermark is embedded in the data's perceptually significant frequency components. This is because an effective watermark cannot be located in perceptually insignificant regions of image data or in its frequency spectrum, since many common signal or geometric processes affect these components. For example, a watermark located in the high frequency spectral components of an image is easily removed, with minor degradation to the image, by a process that performs low pass filtering. The issue then becomes one of how to insert the watermark into the most significant regions of the data frequency spectrum without the alteration being noticeable to an observer, i.e., a human or a machine feature recognition system. Any spectral component may be altered, provided the alteration is small. However, very small alterations are susceptible to any noise present or intentional distortion.

In order to overcome this problem, the frequency domain of the image data or sound data may be considered as a communication channel, and correspondingly the watermark may be considered as a signal transmitted through the channel. Attacks and intentional signal distortions are thus treated as noise to which the transmitted signal must be immune. Attacks are intentional efforts to remove, delete or otherwise overcome the beneficial aspects of the data watermarking. While the present invention is intended to embed watermarks in data, the same methodology can be applied to [...].

Instead of encoding the watermark into the least significant components of the data, the present invention considers applying concepts of spread spectrum communication. In spread spectrum communications, a narrowband signal is transmitted over a much larger bandwidth such that the signal energy present in any single frequency is imperceptible. In a similar manner, the watermark is spread over many frequency bins so that the energy in any one bin is small and certainly imperceptible. Since the watermark verification process includes a priori knowledge of the locations and content of the watermarks, it is possible to concentrate these many weak signals into a single signal with a high signal-to-noise ratio. Destruction of such a watermark would require noise of high amplitude to be added to every frequency bin.

In accordance with the teachings of the present invention, a watermark is inserted into the perceptually most significant regions of the data decomposition. The watermark itself is designed to appear to be additive random noise and is spread throughout the image. By placing the watermark into the perceptually significant components, it is much more difficult for an attacker to add more noise to the components without adversely affecting the image or other data. It is the fact that the watermark looks like noise and is spread throughout the image or data which makes the present scheme appear to be similar to spread spectrum methods used in communications systems.

Spreading the watermark throughout the spectrum of an image ensures a large measure of security against unintentional or intentional attack. First, the location of the watermark is not obvious. Second, frequency regions are selected in a fashion that ensures severe degradation of the original data following any attack on the watermark.

A watermark that is well placed in the frequency domain of an image or a sound track will be practically impossible to see or hear. This will always be the case if the energy in the watermark is sufficiently small in any single frequency coefficient. Moreover, it is possible to increase the energy present in particular frequencies by exploiting knowledge of masking phenomena in the human auditory and visual systems. Perceptual masking refers to any situation where information in certain regions of an image or a sound is occluded by perceptually more prominent information in another part of the image or sound. In digital waveform coding, this frequency domain (and in some cases, time/pixel domain) masking is exploited extensively to achieve low bit rate encoding of data. It is clear that both auditory and visual systems attach more resolution to the high energy, low frequency, spectral regions of an auditory or visual scene. Further, spectrum analysis of images and sounds reveals that most of the information in such data is often located in the low frequency regions.

In addition, particularly for processed or compressed data, perceptually significant need not refer to human perceptual significance, but may refer instead to machine perceptual significance, for instance, machine feature recognition.
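The remark above that a watermark in the high frequency components is easily removed by low pass filtering can be illustrated with a small numerical sketch. It is not part of the disclosed method: the one-dimensional test signal, the band positions, the idealized filter and the helper names `embed`, `low_pass` and `surviving_correlation` are all assumptions made for illustration.

```python
import numpy as np

def embed(signal, band, watermark, alpha):
    """Add alpha * watermark to the real parts of the chosen rFFT bins."""
    spectrum = np.fft.rfft(signal)
    spectrum[band] += alpha * watermark
    return np.fft.irfft(spectrum, n=len(signal))

def low_pass(signal, cutoff):
    """Zero every rFFT bin at or above `cutoff` (an idealized low pass filter)."""
    spectrum = np.fft.rfft(signal)
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def surviving_correlation(original, attacked, band, watermark):
    """Normalized correlation between the watermark and the change seen in `band`."""
    diff = (np.fft.rfft(attacked)[band] - np.fft.rfft(original)[band]).real
    return float(diff @ watermark / (np.linalg.norm(diff) * np.linalg.norm(watermark)))

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(1024))  # smooth signal dominated by low frequencies
watermark = rng.standard_normal(100)           # hypothetical 100-value watermark
bands = {"low": slice(1, 101), "high": slice(413, 513)}

scores = {}
for name, band in bands.items():
    marked = embed(signal, band, watermark, alpha=5.0)
    attacked = low_pass(marked, cutoff=256)    # the "attack": discard the upper half of the spectrum
    scores[name] = surviving_correlation(signal, attacked, band, watermark)

print(scores)  # the low-band score stays near 1, the high-band score is near 0
```

In this sketch the watermark placed in the low band passes through the filter untouched, while the one placed in the high band is discarded together with the bins that carried it.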
To meet these requirements, a watermark is proposed whose structure comprises a large quantity, for instance 1000, of randomly generated numbers with a normal distribution having zero mean and unity variance. A binary watermark is not chosen because it is much less robust to attacks based on collusion of several independently watermarked copies of an image. However, generally, the watermark might have arbitrary structure, both deterministic and/or random, and including uniform distributions. The length of the proposed watermark is variable and can be adjusted to suit the characteristics of the data. For example, longer watermarks might be used for images that are especially sensitive to large modifications of their spectral coefficients, and thus require weaker scaling factors for individual components.

The watermark is then placed in components of the image spectrum. These components may be chosen based on an analysis of those components which are most vulnerable to attack and/or which are most perceptually significant. This ensures that the watermark remains with the image even after common signal and geometric distortions. Modification of these spectral components results in severe image degradation long before the watermark itself is destroyed. Of course, to insert the watermark, it is necessary to alter these very same coefficients. However, each modification can be extremely small and, in a manner similar to spread spectrum communication, a strong narrowband watermark may be distributed over a much broader image (channel) spectrum. Conceptually, detection of the watermark then proceeds by adding all of these very small signals, whose locations are only known to the copyright owner, and concentrating the watermark into a signal with high signal-to-noise ratio. Because the location of the watermark is only known to the copyright holder, an attacker would have to add very much more noise energy to each spectral coefficient in order to be confident of removing the watermark. However, this process would destroy the image.

Preferably, a predetermined number of the largest coefficients of the DCT (discrete cosine transform) (excluding the DC term) are used. However, the choice of the DCT is not critical to the algorithm and other spectral transforms, including wavelet type decompositions, are also possible. In fact, use of the FFT rather than DCT is preferable from a computational perspective.

The invention will be more clearly understood when the following description is read in conjunction with the accompanying drawing.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a schematic representation of typical common processing operations to which data could be subjected;

FIG. 2 is a schematic representation of a preferred system for immersing a watermark into an image;

FIGS. 3a and 3b are flow charts of the encoding and decoding of watermarks;

FIG. 4 is a graph of the responses of the watermark detector to random watermarks;

FIG. 5 is a graph of the response of the watermark detector to random watermarks for an image which is successively watermarked five times;

FIG. 6 is a graph of the response of the watermark detector to random watermarks where five images, each having a different watermark, are averaged together; and

FIG. 7 is a schematic diagram of an optical embodiment of the present invention.

DETAILED DESCRIPTION

In order to better understand the advantages of the invention, the preferred embodiment of a frequency spectrum based watermarking system will be described. It is instructive to examine the processing stages that image (or sound) data may undergo in the copying process and to consider the effect that such processing stages can have on the data. Referring to FIG. 1, a watermarked image or sound data 10 is transmitted 12 to undergo typical distortion or intentional tampering 14. Such distortions or tampering include lossy compression 16, geometric distortion 18, signal processing 20 and D/A and A/D conversion 22. After undergoing distortion or tampering, corrupted watermarked image or sound data 24 is transmitted 26. The process of "transmission" refers to the application of any source or channel code and/or encryption techniques to the data. While most transmission steps are information lossless, many compression schemes (e.g., JPEG, MPEG, etc.) may potentially degrade the quality of the data through irretrievable loss of data. In general, a watermarking method should be resilient to any distortions introduced by transmission or compression algorithms.

Lossy compression 16 is an operation that usually eliminates perceptually irrelevant components of image or sound data. In order to preserve a watermark when undergoing lossy compression, the watermark is located in a perceptually significant region of the data. Most processing of this sort takes place in the frequency domain. Data loss usually occurs in the high frequency components. Thus, the watermark must be placed in the significant frequency component of the image (or sound) data spectrum to minimize the adverse effects of lossy compression.

After receipt, an image may encounter many common transformations that are broadly categorized as geometric distortions or signal distortions. Geometric distortions 18 are specific to image and video data, and include such operations as rotation, translation, scaling and cropping. By manually determining a minimum of four or nine corresponding points between the original and the distorted watermark, it is possible to remove any two or three dimensional affine transformation. However, an affine scaling (shrinking) of the image results in a loss of data in the high frequency spectral regions of the image. Cropping, or the cutting out and removal of portions of an image, also results in irretrievable loss of data. Cropping may be a serious threat to any spatially based watermark but is less likely to affect a frequency-based scheme.

Common signal distortions include digital-to-analog and analog-to-digital conversion 22, resampling, requantization, including dithering and recompression, and common signal enhancements to image contrast and/or color, and audio frequency equalization. Many of these distortions are non-linear, and it is difficult to analyze their effect in either a spatial or frequency based method. However, the fact that the original image is known allows many signal transformations to be undone, at least approximately. For example, histogram equalization, a common non-linear contrast enhancement method, may be substantially removed by histogram specification or dynamic histogram warping techniques.

Finally, the copied image may not remain in digital form. Instead, it is likely to be printed or an analog recording made (analog audio or video tape). These reproductions introduce additional degradation into the image data that a watermarking scheme must be robust to.

Tampering (or attack) refers to any intentional attempt to remove the watermark, or corrupt it beyond recognition. The
watermark must not only be resistant to the inadvertent application of distortions. It must also be immune to intentional manipulation by malicious parties. These manipulations can include combinations of distortions, and can also include collusion and forgery attacks.

FIG. 2 shows a preferred system for inserting a watermark into an image in the frequency domain. Image data X(i,j), assumed to be in digital form, or alternatively data in other formats such as photographs, paintings or the like, that have been previously digitized by well-known methods, is subject to a frequency transformation 30, such as the Fourier transform. A watermark signal W(k) is inserted into the frequency spectrum components of the transformed image data 32 applying the techniques described below. The frequency spectrum image data including the watermark signal is subjected to an inverse frequency transform 34, resulting in watermarked image data X(i,j), which may remain in digital form or be printed as an analog representation by well-known methods.

After applying a frequency transformation to the image data 30, a perceptual mask is computed that highlights prominent regions in the frequency spectrum capable of supporting the watermark without overly affecting perceptual fidelity. This may be performed by using knowledge of the perceptual significance of each frequency in the spectrum, as discussed earlier, or simply by ranking the frequencies based on their energy. The latter method was used in experiments described below.

In general, it is desired to place the watermark in regions of the spectrum that are least affected by common signal distortions and are most significant to image quality as perceived by a viewer, such that significant modification would destroy the image fidelity. In practice, these regions could be experimentally identified by applying common signal distortions to images and examining which frequencies are most affected, and by psychophysical studies to identify how much each component may be modified before significant changes in the image are perceivable.

The watermark signal is then inserted into these prominent regions in a way that makes any tampering create visible (or audible) defects in the data. The requirements of the watermark mentioned above and the distortions common to copying provide constraints on the design of an electronic watermark.

In order to better understand the watermarking method, reference is made to FIGS. 3(a) and 3(b) where from each document D a sequence of values $X = x_1, \ldots, x_n$ is extracted 40 with which a watermark $W = w_1, \ldots, w_n$ is combined 42 to create an adjusted sequence of values $X' = x'_1, \ldots, x'_n$ which is then inserted back 44 into the document in place of values X in order to obtain a watermarked document D'. An attack of the document D', or other distortion, will produce a document D*. Having the original document D and the document D*, a possibly corrupted watermark W* is extracted 46 and compared to watermark W 48 for statistical analysis 50. The values W* are extracted by first extracting a set of values $X^* = x^*_1, \ldots, x^*_n$ from D* (using information about D) and then generating W* from the values X* and the values X.

When combining the values X with the watermark values W in step 42, a scaling parameter $\alpha$ is specified. The scaling parameter $\alpha$ determines the extent to which values W alter values X. Three preferred formulas for computing X' are:

$$x'_i = x_i + \alpha w_i \qquad (1)$$

$$x'_i = x_i (1 + \alpha w_i) \qquad (2)$$

$$x'_i = x_i e^{\alpha w_i} \qquad (3)$$

Equation 1 is invertible. Equations 2 and 3 are invertible when $x_i \neq 0$. Therefore, given X* it is possible to compute the inverse function necessary to derive W* from X and X*.

Equation 1 is not the preferred formula when the values $x_i$ vary over a wide range. For example, if $x_i = 10^6$ then adding 100 may be insufficient to establish a watermark, but if $x_i = 10$, then adding 100 will unacceptably distort the value. Insertion methods using equations 2 and 3 are more robust when encountering such a wide range of values $x_i$. It will also be observed that equations 2 and 3 yield similar results when $\alpha w_i$ is small. Moreover, when $x_i$ is positive, equation 3 is equivalent to $\ln(x'_i) = \ln(x_i) + \alpha w_i$ and may be considered as an application of equation 1 when natural logarithms of the original values are used. For example, if $|w_i| \le 1$ and $\alpha = 0.01$, then using equation 2 guarantees that the spectral coefficient will change by no more than 1%.

For certain applications, a single scaling parameter $\alpha$ may not be best for combining all values of $x_i$. Therefore, multiple scaling parameters $\alpha_1, \ldots, \alpha_n$ can be used with equations 1 to 3 such as $x'_i = x_i (1 + \alpha_i w_i)$. The values $\alpha_i$ serve as a relative measure of how much $x_i$ must be altered to change the perceptual quality of the document. A large value of $\alpha_i$ means that it is possible to alter $x_i$ by a large amount without perceptually degrading the document. [...] For example, equation 2 is a special case of the generalized equation 1, ($x'_i = x_i + \alpha_i w_i$), for $\alpha_i = \alpha x_i$. That is, equation 2 makes the reasonable assumption that a large value of $x_i$ is less sensitive to additive alteration than a small value of $x_i$.

Generally, the sensitivity of the image to different values of $x_i$ is unknown. A method of empirically estimating the sensitivities is to determine the distortion caused by a number of attacks on the original image. For example, it is possible to compute a degraded image D* from D, extract the corresponding values $x^*_1, \ldots, x^*_n$ and select $\alpha_i$ to be proportional to the deviation $|x^*_i - x_i|$. For greater robustness, it is possible to try other forms of distortion and make $\alpha_i$ proportional to the average value of $|x^*_i - x_i|$. Instead of using the average distortion, it is possible to use the median or maximum deviation.

Alternatively, it is possible to combine the empirical approach with general global assumptions regarding the sensitivity of the values. For example, it might be required that $\alpha_i \ge \alpha_j$ whenever $x_i \ge x_j$. This can be combined with the empirical approach by setting $\alpha_i$ according to

$$\alpha_i \propto \max_{\{j \,\mid\, x_j \le x_i\}} |x^*_j - x_j|$$

A more sophisticated approach is to weaken the monotonicity constraint to be robust against occasional outliers.

The length of the watermark, n, determines the degree to which the watermark is spread among the relevant components of the image data. As the size of the watermark increases, so does the number of altered spectral components, and the extent to which each component need be altered decreases for the same resilience to noise. Consider watermarks of the form $x'_i = x_i + \alpha w_i$ and a white noise attack $x^*_i = x'_i + r_i$, where the $r_i$ are chosen according to independent normal distributions with standard deviation $\sigma$. It is possible to recover the watermark when $\alpha$ is proportional to $\sigma / \sqrt{n}$. That is, quadrupling the number of components can halve the magnitude of the watermark placed into each component. The sum of the squares of the deviations remains essentially unchanged.
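The three insertion formulas, their inversion, and the white noise attack described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the values X are random stand-ins rather than transform coefficients, the function names `insert`, `extract` and `normalized_correlation` are hypothetical, and the normalized correlation used to compare W* with W is simply a convenient statistic chosen for this sketch.

```python
import numpy as np

def insert(x, w, alpha, rule):
    """Combine values x with watermark w using equation (1), (2) or (3)."""
    if rule == 1:
        return x + alpha * w
    if rule == 2:
        return x * (1.0 + alpha * w)
    if rule == 3:
        return x * np.exp(alpha * w)
    raise ValueError("rule must be 1, 2 or 3")

def extract(x, x_star, alpha, rule):
    """Invert the insertion rule to derive W* from X and X* (rules 2 and 3 need x != 0)."""
    if rule == 1:
        return (x_star - x) / alpha
    if rule == 2:
        return (x_star / x - 1.0) / alpha
    if rule == 3:
        return np.log(x_star / x) / alpha
    raise ValueError("rule must be 1, 2 or 3")

def normalized_correlation(a, b):
    """Cosine of the angle between two vectors (an assumed comparison statistic)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
n = 1000
x = rng.uniform(10.0, 1000.0, size=n)  # stand-in for the extracted values X, all nonzero
w = rng.standard_normal(n)             # watermark values drawn from N(0, 1)

# Without an attack, each rule is inverted exactly, up to rounding.
round_trip_ok = all(
    np.allclose(extract(x, insert(x, w, 0.1, rule), 0.1, rule), w)
    for rule in (1, 2, 3)
)

# Equation (2) with |w_i| <= 1 and alpha = 0.01 changes no value by more than 1%.
w_clipped = np.clip(w, -1.0, 1.0)
max_rel_change = np.max(np.abs(insert(x, w_clipped, 0.01, 2) - x) / np.abs(x))

# White noise attack on equation (1): x* = x' + r, with r ~ N(0, sigma^2).
alpha, sigma = 1.0, 1.0
x_star = insert(x, w, alpha, 1) + sigma * rng.standard_normal(n)
w_star = extract(x, x_star, alpha, 1)
score_true = normalized_correlation(w_star, w)                         # inserted watermark
score_other = normalized_correlation(w_star, rng.standard_normal(n))  # unrelated watermark

print(round_trip_ok, max_rel_change, score_true, score_other)
```

With the noise as strong as the watermark in every component, the extracted W* still correlates far more strongly with the inserted watermark than with an unrelated one, which is the spreading effect the passage describes.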




In general, a watermark comprises an arbitrary sequence of real numbers $W = w_1, \ldots, w_n$. In practice, each value $w_i$ may be chosen independently from a normal distribution N(0,1), where $N(\mu, \sigma^2)$ denotes a normal distribution with mean $\mu$ and variance $\sigma^2$, or from a uniform distribution over {1,-1} or {0,1}.

It is highly unlikely that the extracted mark W* will be identical to the original watermark W. Even the act of requantizing the watermarked document for transmission will cause W* to deviate from W. A preferred measure of the similarity of W and W* is

$$\mathrm{sim}(W, W^*) = \frac{W^* \cdot W}{\sqrt{W^* \cdot W^*}} \qquad (4)$$

Large values of sim(W,W*) are significant in view of the following analysis. Assume that the authors of document D* had no access to W (either through the seller or through a watermarked document). Then for whatever value of W* is obtained, the conditional distribution on $w_i$ will be independently distributed according to N(0,1). In this case,

$$W^* \cdot W = \sum_{i=1}^{n} w^*_i w_i \sim N\left(0,\ \sum_{i=1}^{n} {w^*_i}^2\right) = N(0,\ W^* \cdot W^*)$$

Thus, sim(W,W*) is distributed according to N(0,1). Then, one may apply the standard significance tests for the normal distribution. For example, if D* is chosen independently from W, then it is very unlikely that sim(W,W*) > 5. Note that somewhat higher values of sim(W,W*) may be needed when a large number of watermarks are on file. The above analysis required only the independence of W from W*, and did not rely on any specific properties of W* itself. This fact provides further flexibility when preprocessing W*.

The extracted watermark W* may be processed in several ways to potentially enhance the ability to extract a watermark. For example, experiments on images encountered instances where the average value of W*, denoted $E_i(W^*)$, differed substantially from 0, due to the effects of a dithering procedure. While this artifact could be easily eliminated as part of the extraction process, it provides a motivation for postprocessing extracted watermarks. As a result, it was discovered that the simple transformation $w^*_i \leftarrow w^*_i - E_i(W^*)$ yielded superior values of sim(W,W*). The improved performance resulted from the decreased value of $W^* \cdot W^*$; the value of $W^* \cdot W$ was only slightly affected.

In experiments it was frequently observed that $w^*_i$ could be greatly distorted for some values of i. One postprocessing option is to simply ignore such values, setting them to 0. That is,

$$w^*_i \leftarrow \begin{cases} 0 & \text{if } |w^*_i| > \text{tolerance} \\ w^*_i & \text{otherwise} \end{cases}$$

The goal of such a transformation is to lower $W^* \cdot W^*$. A less abrupt version of this approach is to normalize the W* values to be either -1, 0 or 1, by

$$w^*_i \leftarrow \mathrm{sign}(w^*_i - E_i(W^*))$$

This transformation can have a dramatic effect on the statistical significance of the result. Other robust statistical techniques could also be used to suppress outlier effects.

In principle, any frequency domain transform can be used. In the scheme described below, a Fourier domain method is used, but wavelet based schemes are also usable as a variation. In terms of selecting frequency regions of the transform, it is possible to use models for the perceptual system under consideration.

Frequency analysis may be performed by a wavelet or sub-band transform where the signal is divided into sub-bands by means of a wavelet or multi-resolution transform. The sub-bands need not be uniformly spaced. Each sub-band may be thought of as representing a frequency region in the domain corresponding to a sub-region of the frequency range of the signal. The watermark is then inserted into the sub-regions.

For audio data, a sliding "window" moves along the signal data and the frequency transform (DCT, FFT, etc.) is taken of the sample in the window. This process enables the capture of meaningful information of a signal that is time varying in nature.

Each coefficient in the frequency domain is assumed to have a perceptual capacity. That is, it can support the insertion of additional information without any (or with minimal) impact to the perceptual fidelity of the data.

In order to place a length L watermark into an NxN image, the NxN FFT (or DCT) of the image is computed and the watermark is placed into the L highest magnitude coefficients of the transform matrix, excluding the DC component. More generally, L randomly chosen coefficients could be chosen from the M, M ≥ L, most perceptually significant coefficients of the transform. For most images, these coefficients will be the ones corresponding to the low frequencies. The purpose of placing the watermark in these locations is that significant tampering with these frequencies will destroy the image fidelity or perceived quality well before the watermark is destroyed.

The FFT provides perceptually similar results to the DCT. This is different from the case of transform coding, where the DCT is preferred to the FFT due to its spectral properties. The DCT tends to have less high frequency information than the FFT, and places most of the image information in the low frequency regions, making it preferable in situations where data need to be eliminated. In the case of watermarking, image data is preserved, and nothing is eliminated. Thus the FFT is as good as the DCT, and is preferred since it is easier to compute.

In an experiment, a visually imperceptible watermark was intentionally placed in an image. Subsequently, 100 randomly generated watermarks, only one of which corresponded to the correct watermark, were applied to the watermark detector described above. The result, as shown in FIG. 4, was a very strong positive response corresponding to the correct watermark, suggesting that the method results in a very low number of false positive responses and a very low false negative response rate.

In another test, the watermarked image was scaled to half of its original size. In order to recover the watermark, the image was re-scaled to its original size, albeit with loss of detail due to subsampling of the image using low pass spatial filter operations. The response of the watermark detector was well above random chance levels, suggesting that the watermark is robust to geometric distortions. This result was achieved even though 75 percent of the original data was missing from the scaled down image.

In a further experiment, a JPEG encoded version of the image with parameters of 10 percent quality and 0 percent smoothing, resulting in visible distortions, was used. The results of the watermark detector suggest that the method is robust to common encoding distortions. Even using a version of the image with parameters of the 5 percent quality
and 0 percent smoothing, the results were well above that achievable due to random chance.

In experiments using a dithered version of the image, the response of the watermark detector suggested that the method is robust to common encoding distortion. Moreover, more reliable detection is achieved by removing any non-zero mean from the extracted watermark.

In another experiment, the image was clipped, leaving only the central quarter of the image. In order to extract the watermark from the clipped image, the missing portion of the image was replaced with portions from the original unwatermarked image. The watermark detector was able to recover the watermark with a response greater than random. When the non-zero mean was removed, and the elements of the watermark were binarized prior to the comparison with the correct watermark, the detector response was improved. This result is achieved even though 75 percent of the data was removed from the image.

In yet another experiment, the image was printed, photocopied, scanned using a 300 dpi Umax PS-2400x scanner and rescaled to a size of 256x256 pixels. Clearly, the final image suffered from different levels of distortion introduced at each process. High frequency pattern noise was particularly noticeable. When the non-zero mean was removed and only the sign of the elements of the watermark was used, the watermark detector response improved to well above random chance levels.

In still another experiment, the image was subject to five successive watermarking operations. That is, the original image was watermarked, the watermarked image was watermarked, and so forth. The process may be considered another form of attack in which it is clear that significant image degradation occurs if the process is repeated. FIG. 5 shows the response of the watermark detector to 1000 randomly generated watermarks, including the five watermarks present in the image. The five dominant spikes in the graph, indicative of the presence of the five watermarks, show that successive watermarking does not interfere with the process.

The fact that successive watermarking is possible means that the history or pedigree of a document is determinable if successive watermarking is added with each copy.

In a variation of the multiple watermark image, five separately watermarked images were averaged together to simulate a simple collusion attack. FIG. 6 shows the response of the watermark detector to 1000 randomly generated watermarks, including the five watermarks present in the original images. The result is that simple collusion based on averaging is ineffective in defeating the present watermarking system.

The result of the above experiments is that the described system can extract a reliable copy of the watermark from images that have been significantly degraded through several common geometric and signal processing procedures. These procedures include zooming (low pass filtering), cropping, lossy JPEG encoding, dithering, printing, photocopying and subsequent rescanning.

While these experiments were, in fact, conducted using an image, similar results are attainable with text images, audio data and video data, although attention must be paid to the time varying nature of these data.

The above implementation of the watermarking system is an electronic system. Since the basic principle of the invention is the inclusion of a watermark into spectral frequency components of the data, watermarking can be accomplished by other means using, for example, an optical system as shown in FIG. 7.

In FIG. 7, data to be watermarked such as an image 52 is passed through a spatial transform lens 54, such as a Fourier transform lens, the output of which lens is the spatial transform of the image. Concurrently, a watermark image 56 is passed through a second spatial transform lens 58, the output of which lens is the spatial transform of the watermark image 56. The spatial transform from lens 54 and the spatial transform from lens 58 are combined at an optical combiner 60. The output of the optical combiner 60 is passed through an inverse spatial transform lens 62 from which the watermark image 64 is present. The result is a unique, virtually imperceptible, watermarked image. Similar results are achievable by transmitting video or multimedia signals through the lenses in the manner described above.

While there have been described and illustrated spread spectrum watermarking of data and variations and modifications thereof, it will be apparent to those skilled in the art that further variations and modifications are possible without deviating from the broad principles and spirit of the present invention which shall be limited solely by the scope of the claims appended hereto.

What is claimed is:

1. A method of inserting a watermark into data comprising the steps of:

obtaining a spectral decomposition of data to be watermarked which data is a representation of humanly perceivable material;

inserting a watermark into the perceptually significant components of the decomposition of data; and

applying an inverse transform to the decomposition of data with the watermark for generating watermarked data.

2. A method of inserting a watermark into data as set forth in claim 1, where said data comprises image data.

3. A method of inserting a watermark into data as set forth in claim 1, where said data comprises video data.

4. A method of inserting a watermark into data as set forth in claim 1, where said data comprises audio data.

5. A method of inserting a watermark into data as set forth in claim 1, where said data comprises multimedia data.

6. A method of inserting a watermark into data as set forth in claim 1, where said obtaining a spectral decomposition of data is selected from the group consisting of Fourier transformation, discrete cosine transformation, Hadamard transformation, and wavelet, multi-resolution, sub-band method.

7. A method of inserting a watermark into data as set forth in claim 6, where said inserting a watermark inserts watermark values where addition of additional signal into a perceptually significant component affects the perceived quality of the data.

8. A method of inserting a watermark into data as set forth in claim 1, further comprising:

comparing data with watermarked data for obtaining extracted data values;

comparing extracted data values with watermark values and data for obtaining difference values; and

analyzing difference values to determine the watermark in the watermarked data.

9. The method of inserting a watermark into data as set forth in claim 8, where watermark values include associated scaling parameters.

10. A method of inserting a watermark into data as set forth in claim 9, where scaling parameters are selected such that adding additional watermark value affects the perceived quality of the data.

11. A method of inserting a watermark into data as set
forth in claim 8, where the watermark values are chosen
according to a random distribution. 

12. A method of inserting a watermark into data compris- 
ing the steps of: 

extracting values of perceptually significant components
of a spectral decomposition of data which data is a 
representation of human perceivable material; 

combining watermark values with the extracted values to 
create adjusted values; and 

inserting the adjusted values into the data in place of the 
extracted values to produce watermarked data. 

13. The method of inserting a watermark into data as set 
forth in claim 12, where watermark values include associ- 
ated scaling parameters. 

14. A method of inserting a watermark into data as set 
forth in claim 13, where scaling parameters are selected such 
that adding additional watermark value affects the perceived 
quality of the data. 

15. A method of inserting a watermark into data as set 
forth in claim 12, where the watermark values are chosen 
according to a random distribution.

16. A method of inserting a watermark into data as set 
forth in claim 12, further comprising: 

comparing data with watermarked data for obtaining
extracted data values;

comparing extracted data values with watermark values
and data for obtaining difference values; and

analyzing difference values to determine the watermark in
the watermarked data.

17. The method of inserting a watermark into data as set 
forth in claim 16, where watermark values include associ- 
ated scaling parameters. 

18. A method of inserting a watermark into data as set 
forth in claim 12, where scaling parameters are selected such 
that adding additional watermark value affects the perceived 
quality of the data. 

19. A method of inserting a watermark into data as set 
forth in claim 16, where the watermark values are chosen
according to a random distribution. 

20. A method of inserting a watermark into data as set
forth in claim 16, further comprising the step of preprocess- 
ing distorted or tampered watermarked data before said 
comparing data. 

21. A method of inserting a watermark into data as set 
forth in claim 20, where said distorted or tampered watermarked
data is clipped data and said preprocessing comprises
replacing missing portions of the data with corresponding
portions from original unwatermarked data.

22. A method of inserting a watermark into data as set 
forth in claim 12, where said combining watermark values 
sequentially combines watermark values for a plurality of
watermarks. 

23. A system for inserting a watermark into data comprising:

providing image data;

providing watermark data;

first transform lens for transforming image data passing
therethrough into transformed image data;

second transform lens for transforming watermark data
passing therethrough into transformed watermark data;

optical combiner for combining the transformed image
data and the transformed watermark data to form
transformed watermarked data; and

inverse transform lens for forming watermarked data by
inverse transformation of transformed watermarked
data.
24. A system for inserting a watermark into data as set
forth in claim 23, where said first transform lens and said 
second transform lens are Fourier transform lenses and said 
inverse transform lens is an inverse Fourier transform lens. 

25. A method of inserting a watermark into data compris- 
ing the steps of: 

providing a medium containing data; 
obtaining a spectral decomposition of data to be water- 
marked; 

inserting a watermark into the perceptually significant 
components of the decomposition of data; and 

applying an inverse transform to the decomposition of 
data with the watermark to generate watermarked data. 

26. A method of inserting a watermark into data as set 
forth in claim 25, where said data comprises image data. 

27. A method of inserting a watermark into data as set 
forth in claim 25, where said data comprises video data. 

28. A method of inserting a watermark into data as set 
forth in claim 25, where said data comprises audio data. 

29. A method of inserting a watermark into data as set 
forth in claim 25, where said data comprises multimedia 
data. 

30. A method of inserting a watermark into data as set 
forth in claim 25, where said obtaining a spectral decom- 
position of data is selected from the group consisting of 
Fourier transformation, discrete cosine transformation, Had- 
amard transformation, and wavelet, multi-resolution, sub- 
band method. 

31. A method of inserting a watermark into data as set 
forth in claim 30, where said inserting a watermark inserts 
watermark values where addition of additional signal into a 
perceptually significant component affects the perceived 
quality of the data. 

32. A method of inserting a watermark into data as set 
forth in claim 25, further comprising: 

comparing data with watermarked data for obtaining
extracted data values;

comparing extracted data values with watermark values
and data for obtaining difference values; and

analyzing difference values to determine the watermark in
the watermarked data.

33. The method of inserting a watermark into data as set 
forth in claim 32, where watermark values include associ- 
ated scaling parameters. 

34. A method of inserting a watermark into data as set 
forth in claim 33, where scaling parameters are selected such 
that adding additional watermark value affects the perceived 
quality of the data. 

35. A method of inserting a watermark into data as set 
forth in claim 32, where the watermark values are chosen 
according to a random distribution. 

36. A method of inserting a watermark into data comprising
the steps of:

providing a medium containing data;

extracting values of perceptually significant components
of a spectral decomposition of the data;

combining watermark values with the extracted values to
create adjusted values; and

inserting the adjusted values into the data in place of the
extracted values to produce watermarked data.

37. The method of inserting a watermark into data as set 
forth in claim 36, where watermark values include associ- 
ated scaling parameters. 

38. A method of inserting a watermark into data as set 
forth in claim 37, where scaling parameters are selected such 
that adding additional watermark value affects the perceived
quality of the data. 

39. A method of inserting a watermark into data as set 
forth in claim 36, where the watermark values are chosen 
according to a random distribution.

40. A method of inserting a watermark into data as set 
forth in claim 36, further comprising: 

comparing data with watermarked data for obtaining
extracted data values;

comparing extracted data values with watermark values
and data for obtaining difference values; and

analyzing difference values to determine the watermark in
the watermarked data.

41. The method of inserting a watermark into data as set 
forth in claim 40, where watermark values include associ- 
ated scaling parameters. 

42. A method of inserting a watermark into data as set 
forth in claim 41, where scaling parameters are selected such 
that adding additional watermark value affects the perceived 
quality of the data. 
43. A method of inserting a watermark into data as set
forth in claim 40, where the watermark values are chosen 
according to a random distribution. 

44. A method of inserting a watermark into data as set 
forth in claim 40, further comprising the step of preprocessing
distorted or tampered watermarked data before said
comparing data. 

45. A method of inserting a watermark into data as set 
forth in claim 44, where said distorted or tampered watermarked
data is clipped data and said preprocessing comprises
replacing missing portions of the data with corresponding
portions from original unwatermarked data.

46. A method of inserting a watermark into data as set 
forth in claim 36, where said combining watermark values 
sequentially combines watermark values for a plurality of 
watermarks. 



* * * * * 
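
The extraction postprocessing described in the specification above (removing a non-zero mean from the extracted watermark W*, or keeping only the sign of each element) can be illustrated with a minimal Python sketch. It assumes the similarity measure has the form sim(W, W*) = W*·W / sqrt(W*·W*), which is consistent with the statement that lowering W*·W* improves the response; the function names and the toy data are hypothetical and not part of the specification.

```python
import math


def sim(w, w_star):
    """Assumed similarity measure: (W* . W) / sqrt(W* . W*)."""
    dot = sum(a * b for a, b in zip(w_star, w))
    energy = sum(a * a for a in w_star)
    return dot / math.sqrt(energy) if energy > 0 else 0.0


def remove_mean(w_star):
    """w_i* <- w_i* - E(W*): subtract the average of the extracted values."""
    mean = sum(w_star) / len(w_star)
    return [v - mean for v in w_star]


def sign_normalize(w_star):
    """w_i* <- sign(w_i* - E(W*)): map every element to -1, 0 or 1."""
    return [(v > 0) - (v < 0) for v in remove_mean(w_star)]


# Toy data (illustrative only): the extracted watermark equals the
# original plus a constant offset, such as dithering might introduce.
original = [1.0, -1.0, 1.0, -1.0]
extracted = [v + 2.0 for v in original]

before = sim(original, extracted)              # offset inflates W*.W*
after = sim(original, remove_mean(extracted))  # offset removed
```

In this toy case the offset leaves W*·W unchanged but inflates W*·W*, so removing the mean raises the detector response, which mirrors the behavior reported in the experiments.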
METHOD AND APPARATUS FOR SCENE-BASED VIDEO WATERMARKING

RELATED DOCUMENTS

This application claims the benefit of U.S. Provisional Application No. 60/^4.979, filed Aug. 1996, which is hereby incorporated by reference. U.S. Provisional Application No. 60/050487, filed Jun. 24, 1997, the benefit of which is also claimed, is also hereby incorporated by reference. Co-filed applications entitled "Method and Apparatus for Embedding Data, Including Watermarks, in Human Perceptible Sounds," Appl. Ser. No. 08/918,891, now U.S. Pat. No. 6,061,793, "Method and Apparatus for Embedding Data, Including Watermarks, in Human Perceptible Images," Appl. Ser. No. 08/918,122, now U.S. Pat. No. 6,031,914 and "Method and Apparatus for Video Watermarking," Appl. Ser. No. 08/918,125, and "Digital Watermarking to Resolve Multiple Claims of Ownership," Appl. Ser. No. 08/918,126 are also hereby incorporated by reference.

STATEMENT REGARDING GOVERNMENT RIGHTS

The present invention was made with government support by AFOSR under grant AF/F49620-94-1-0461, NSF grant INT-9406954, and AF/F49620-93-1-0558. The Government has certain rights in this invention.

FIELD OF THE INVENTION

This invention relates generally to techniques for embedding data such as watermarks, signatures and captions in digital data, and more particularly to scene-based watermarks in digital data that relates to video.

BACKGROUND OF THE INVENTION

Digital video is readily reproduced and distributed over information networks. However, these attractive properties lead to problems enforcing copyright protection. As a result, creators and distributors of digital video are hesitant to provide access to their digital intellectual property. Digital watermarking has been proposed as a means to identify the owner and distribution path of digital data. Digital watermarks address this issue by embedding owner identification directly into the digital data itself. The information is embedded by making small modifications to the pixels in each video frame. When the ownership of a video is in question, the information can be extracted to completely characterize the owner or distributor of the data.

Video watermarking introduces issues that generally do not have a counterpart in images and audio. Video signals are highly redundant by nature, with many frames visually similar to each other. Due to large amounts of data and inherent redundancy between frames, video signals are highly susceptible to pirate attacks, including frame averaging, frame dropping, interpolation, statistical analysis, etc. Many of these attacks may be accomplished with little damage to the video signal. A video watermark must handle such attacks. Furthermore, it should identify any image created from one or more frames in the video.

Furthermore, to be useful, a watermark must be perceptually invisible, statistically undetectable, robust to distortions applied to the host video, and able to resolve multiple ownership claims. Some watermarking techniques modify spatial/temporal data samples, while others modify transform coefficients. A particular problem afflicting all prior art techniques, however, is the resolution of rightful ownership of digital data when multiple ownership claims are made, i.e., the deadlock problem. Watermarking schemes that do not use the original data set to detect the watermark are most vulnerable to deadlock. A pirate simply adds his or her watermark to the watermarked data. It is then impossible to establish who watermarked the data first.

Watermarking procedures that require the original data set for watermark detection also suffer from deadlocks. In such schemes, a party other than the owner may counterfeit a watermark by "subtracting off" a second watermark from the publicly available data and claiming the result to be his or her original. This second watermark allows the pirate to claim copyright ownership since he or she can show that both the publicly available data and the original of the rightful owner contain a copy of their counterfeit watermark.

There is a need, therefore, for watermarking procedures applicable to video digital data that do not suffer from the described shortcomings, disadvantages and problems.

SUMMARY OF THE INVENTION

The above-identified shortcomings, disadvantages and problems found within the prior art are addressed by the present invention, which will be understood by reading and studying the following specification. The invention provides scene-based watermarking of video data.

In one embodiment of the invention, scenes are extracted from video host data that is made up of a number of successive frames. Each scene thus includes a number of frames. Each frame undergoes a wavelet transformation, which is then segmented into blocks. A frequency mask is applied to the corresponding frequency-domain blocks, which is then weighted with the author signature, also in the frequency domain. The resulting weighted block is taken out of the frequency domain, and then weighted with the spatial mask for its corresponding wavelet transformed block. A unique watermark generation routine is also described that assists in the resolution of deadlock.

The approach of the invention provides advantages over the approaches found in the prior art. In the prior art, an independent watermark applied to each frame may result in detection of the watermark by statistically comparing or averaging similar regions and objects in successive video frames, as has been described in the background. However, the inventive scene-based approach addresses this issue by embedding a watermark that is a composite of static and dynamic components, the dynamic components preventing detection by statistical comparison across frames. Therefore, statistical comparison or averaging does not yield the watermark.

Further aspects, advantages and embodiments of the invention will become apparent by reference to the drawings, and by reading the following detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of a method of a video watermarking process according to an embodiment of the invention;

FIG. 2 is a flowchart of a method of an object-based video watermarking process according to an embodiment of the invention;

FIG. 3 is a diagram of a typical computer to be used with embodiments of the invention;

FIG. 4 is a block diagram of a specific implementation of scene-based video watermarking, based on the methods of FIG. 1 and FIG. 2, according to an embodiment of the invention;

FIG. 5 is a diagram showing a masking weighting function k(f) according to one embodiment of the invention; and,

FIG. 6 is a diagram showing a two-band perfect reconstruction filter in accordance with which a wavelet transform can be computed according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description of the preferred embodiments, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific preferred embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical and electrical changes may be made without departing from the spirit and scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense.

Overview of the Watermarking Process

Referring to FIG. 1, a flowchart of a method of a video watermarking process, according to one embodiment of the invention, is shown. Specifically, the method of FIG. 1 imbeds watermark data into host video data. In step 10, the watermark data is generated, which is the signature, or watermark, that acts as a unique identifier for the host video data. Note that the signature inherently is spread across the frequency spectrum without explicit spread-spectrum processing.

In one embodiment of the invention, the signature is a pseudo-random sequence, which is created using a pseudo-random generator and two keys. With the two proper keys, the watermark may be extracted. Without the two keys, the data hidden in the video is statistically invisible and impossible to recover. Pseudo-random generators are well within the art. For example, the reference R. Rivest, "Cryptography," in Handbook of Theoretical Computer Science (J. van Leeuwen, ed.), vol. 1, ch. 13, pp. 717-755, Cambridge, Mass.: MIT Press, 1990, which is hereby incorporated by reference, describes such generators.

In one embodiment, the creation of the watermark data in step 10 works as follows. The author has two random keys x1 and x2 (i.e., seeds) from which the pseudo-random sequence y can be generated using a suitable cryptographic operator g(x1,x2), as known within the art. The noise-like sequence y, after some processing, is the actual watermark hidden into the video stream. The key x1 is author dependent. The key x2 is signal dependent. In particular, x1 is the secret key assigned to (or chosen by) the author. Key x2 is computed from the video signal which the author wishes to watermark. The signal dependent key is computed from the masking values of the original signal. The masking values give us tolerable error levels in the host video signal. The tolerable error levels are then hashed to a key x2.

The operator g( ) is called a pseudo-random sequence generator. For the pseudo-random generator to be useful, a pirate must not be able to predict bits of y or infer the keys x1 or x2 from knowledge of some bits of y. There are several popular generators that satisfy these properties, including RSA, Rabin, Blum/Micali, and Blum/Blum/Shub, as known within the art. For example, the Blum/Blum/Shub pseudo-random generator uses the one way function y=g(x)=x*x mod n, where n=pq for primes p and q so that p=q=3 mod 4. It can be shown that generating x or y from partial knowledge of y is computationally infeasible for the Blum/Blum/Shub generator. The classical maximal length pseudo noise sequences (i.e., m-sequences) generated by linear feedback shift registers are not used for this purpose. Sequences generated by shift registers are cryptographically insecure, as one can solve for the feedback pattern (i.e., the keys) given a small number of output bits y.

Thus, a pirate is not free to subtract off a second watermark y' arbitrarily. The pirate must supply the keys x1' and x2' which generate the watermark y' they wish to embed. It is computationally infeasible to invert the one-way function y'=g(x1',x2') to obtain x1' and x2'. Furthermore, x2' is not arbitrary. It is computed directly from the original video signal, which is inaccessible to the pirate. As a result, the two-key pseudo-random sequence author representation resolves the deadlock problem.

In step 11, a wavelet transform is applied along the temporal axis of the video host data, resulting in a multiresolution temporal representation of the video. In particular, the representation consists of temporal lowpass frames and highpass frames. The lowpass frames consist of the static components in the video scene. The highpass frames capture the motion components and changing nature of the video sequence (i.e., the video host data). The watermark is designed and embedded in each of these components. The watermarks embedded in the lowpass frames exist throughout the entire video scene. The watermarks embedded in the motion frames are highly localized in time and change rapidly from frame to frame. Thus, the watermark is a composite of static and dynamic components. The combined representation overcomes drawbacks associated with a fixed or independent watermarking procedure. (I.e., avoidance of watermark detection by statistical comparison between successive frames is achieved.)

A wavelet transform can be computed using a two-band perfect reconstruction filter bank as shown in FIG. 6. The video signal is simultaneously passed through lowpass L filter 70 and highpass H filter 72 and then decimated by 2 (as represented by elements 74 and 76 of FIG. 6) to give static (no motion) and dynamic (motion) components of the original signal. The two decimated signals may be up sampled (as represented by elements 78 and 80), and then passed through complementary filters 82 and 84 and summed as represented by element 86 to reconstruct the original signals. Wavelet filters are widely available within the art. For instance, the reference P. P. Vaidyanathan, Multirate Systems and Filter Banks, Englewood Cliffs, N.J.: PTR Prentice-Hall, Inc., 1992, which is hereby incorporated by reference, describes such filters.

Referring back to FIG. 1, in step 12, the data generated by step 10 is imbedded into a perceptual mask of the host video data as represented by the temporal wavelet transform of step 11. The present invention employs perceptual masking models to determine the optimal locations within host data in which to insert the watermark. The perceptual mask is specific to video host data. The mask provides for the watermark data generated by step 10 to be embedded with the host data, at places typically imperceptible to the human eye. That is, the perceptual mask exploits masking properties of the human visual system. Step 12 embeds the watermark within the temporally wavelet transformed host data such that they will not be perceived by a human eye, as defined by the perceptual model. The perceptual masking of step 12 is conducted in the frequency domain.

Thus, image masking models based on the human visual system (HVS) are used to ensure that the watermark embedded into each video frame is perceptually invisible and robust. Visual masking refers to a situation where a signal raises the visual threshold for other signals around it. Masking characteristics are used in high quality low bit rate coding algorithms to further reduce bit rates. The masking models presented here are based on image models.
The masking models give the perceptual tolerance for
image coefficients and transform coefficients. These masking
models are also described in the reference B. Zhu, et al.,
"Low Bit Rate Near-Transparent Image Coding," in Proc. of
the SPIE Int'l Conf. on Wavelet Apps. for Dual Use, vol.
2491, (Orlando, Fla.), pp. 173-184, 1995, which is hereby
incorporated by reference, and in the reference B. Zhu, et al.,
"Image Coding with Mixed Representations and Visual
Masking," in Proc. 1995 IEEE Int'l Conf. on Acoustics,
Speech and Signal Processing, (Detroit, Mich.), pp.
2327-2330, 1995, which is also hereby incorporated by
reference. The frequency masking model is based on the
knowledge that a masking grating raises the visual threshold
for signal gratings around the masking frequency. The model,
which is based on the discrete cosine transform (DCT), expresses
the contrast threshold at frequency f as a function of f, the
masking frequency fm and the masking contrast cm:


$$c(f, f_m, c_m) = c_0(f)\,\max\{1, [k(f/f_m)\, c_m]^{\alpha}\}$$

where c0(f) is the detection threshold at frequency f. The
mask weighting function k(f) is shown in FIG. 5. To find the
contrast threshold c(f) at a frequency f in an image, the DCT
is first used to transform the image into the frequency
domain and find the contrast at each frequency. The value
α=0.62 was determined experimentally by psycho-visual
tests, as described in G. E. Legge and J. M. Foley,
"Contrast Masking in Human Vision," Journal Optical Society
of America, vol. 70, no. 12, pp. 1458-1471 (1980),
which is hereby incorporated by reference. Then, a summation
rule of the form

is used to sum up the masking effects from all the masking
signals near f. If the contrast error at f is less than c(f), the
model predicts that the error is invisible to human eyes.

In step 14, the host video data as subjected to a temporal
wavelet transform in step 11, with the embedded watermark
data from step 12, is further subjected to a non-frequency
mask. Because the perceptual mask in step 12 is a frequency
domain mask, a further mask is necessary to ensure that the
embedded data remains invisible in the host video data. The
non-frequency mask is a spatial mask.

Frequency masking effects are localized in the frequency
domain, while spatial masking effects are localized in the
spatial domain. Spatial masking refers to the situation that an
edge raises the perceptual threshold around it. Any model for
spatial masking can be used, and such models are well
known in the art. However, the model used in one embodiment
of the invention is similar to the model described in the
Zhu, "Low Bit Rate ..." reference previously incorporated
by reference, and which is itself based on a model proposed
by Girod in "The Information Theoretical Significance of
Spatial and Temporal Masking in Video Signals," in Proceedings
of the SPIE Human Vision, Visual Processing, and
Digital Display, vol. 1077, pp. 178-187 (1989), which is
also herein incorporated by reference.

In one embodiment, the upper channel of Girod's model
is linearized under the assumption of small perceptual errors,
the model giving the tolerable error level for each pixel in
the image, as those skilled in the art can appreciate.
Furthermore, under certain simplifying assumptions
described in the Zhu "Bit Rate ..." reference, the tolerable
error level for a pixel p(x,y) can be obtained by first
computing the contrast saturation at (x,y)

where the weight W4(x,y,x',y') is a Gaussian centered at the
point (x,y) and T is a visual test based threshold. Once
dcsat(x,y) is computed, the luminance on the retina, dlret, is
obtained from the equation

From dlret, the tolerable error level ds(x,y) for the pixel
p(x,y) is computed from

The weights W1(x,y) and W2(x,y) are based on Girod's
model. The masking model predicts that changes to pixel
p(x,y) less than ds(x,y) introduce no perceptible distortion.
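
As a rough illustration of the frequency masking model described earlier, the sketch below evaluates a single-masker contrast threshold of the form c(f, fm, cm) = c0(f) max{1, [k(f/fm) cm]^α} with α = 0.62. The shape of the weighting function k (given only as FIG. 5 in the text), the constant detection threshold c0, and the numeric values are all hypothetical placeholders chosen for the example. The summation over several maskers is not shown.

```python
import math

ALPHA = 0.62  # exponent reported in the text from psycho-visual tests


def mask_weight(ratio):
    """Hypothetical stand-in for k(f/fm): equals 1 at the masking
    frequency and falls off with distance in log-frequency."""
    return math.exp(-(math.log2(ratio) ** 2))


def contrast_threshold(f, fm, cm, c0=0.01):
    """Threshold at frequency f raised by one masker at fm with contrast cm.
    c0 is an assumed, frequency-independent detection threshold."""
    return c0 * max(1.0, (mask_weight(f / fm) * cm) ** ALPHA)


def error_is_invisible(error_contrast, f, fm, cm, c0=0.01):
    """The model predicts invisibility when the error stays below c(f)."""
    return error_contrast < contrast_threshold(f, fm, cm, c0)


near = contrast_threshold(8.0, 8.0, 32.0)   # strong masker at the same frequency
far = contrast_threshold(64.0, 8.0, 32.0)   # same masker, distant frequency
```

With these placeholder values the threshold is raised only near the masking frequency, while far from it the threshold falls back to c0.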

As has been described, steps 10, 11, 12 and 14 of FIG.
1 provide an overview of the video watermarking process of 
the present invention. An overview of the scene-based video 
watermarking process of the present invention is now 
described. 

Overview of the Scene-Based Video Watermarking
Process

Referring to FIG. 2, a flowchart of a method of a scene- 
based video watermarking process, according to one 
embodiment of the invention, is shown. The method utilizes 
the watermarking method of FIG. 1 already described. In 
step 24, a video sequence (i.e., the host video data) is broken 
(segmented) into scenes, as known within the art. For 
example, the reference J. Nam and A. H. Tewfik, "Combined
Audio and Visual Streams Analysis for Video Sequence 
Segmentation," in Proceedings of the 1997 International 
Conference on Acoustics, Speech and Signal Processing, 
(Munich, Germany), pp. 2665-2668 (April 1997), which is 
hereby incorporated by reference, describes such scene 
segmentation. Segmentation into scenes allows the water- 
marking procedures to take into account temporal redun- 
dancy. Visually similar regions in the video sequence, e.g., 
frames from the same scene, must be embedded with a 
consistent watermark. The invention is not limited to a
particular scene segmentation algorithm, however.

In step 26, a temporal wavelet transform is applied on the 
video scenes, as has been previously described. That is, each 
scene comprises a number of frames, such that a temporal 
wavelet transform is applied to each frame within a scene. 
The resulting frames are known as wavelet frames. The 
multiresolution nature of the wavelet transform allows the 
watermark to exist across multiple temporal scales, resolving
pirate attacks. For example, the embedded watermark in
the lowest frequency (DC) wavelet frame exists in all frames
in the scene.
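
The temporal wavelet transform of a scene can be sketched as below. The text does not name a particular wavelet, so this example assumes a Haar filter with an averaging normalization and a scene length that is a power of two. Frames are flattened to plain lists of pixel values, and the function names are illustrative only.

```python
def temporal_haar(frames):
    """Full Haar decomposition along the time axis.
    Returns wavelet frames ordered from lowest to highest frequency,
    with the DC frame first."""
    k = len(frames)
    if k == 0 or k & (k - 1):
        raise ValueError("scene length must be a power of two")
    low = [list(f) for f in frames]
    bands = []
    while len(low) > 1:
        pairs = list(zip(low[0::2], low[1::2]))
        high = [[(a - b) / 2 for a, b in zip(p, q)] for p, q in pairs]
        low = [[(a + b) / 2 for a, b in zip(p, q)] for p, q in pairs]
        bands = high + bands       # coarser highpass bands go in front
    return low + bands


def inverse_temporal_haar(coeffs):
    """Invert temporal_haar, returning frames in time order."""
    low = [list(coeffs[0])]
    pos = 1
    while pos < len(coeffs):
        high = coeffs[pos:pos + len(low)]
        pos += len(low)
        nxt = []
        for l, h in zip(low, high):
            nxt.append([a + d for a, d in zip(l, h)])
            nxt.append([a - d for a, d in zip(l, h)])
        low = nxt
    return low


scene = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
coeffs = temporal_haar(scene)
w = [0.5, -0.5]                                    # toy watermark for the DC frame
coeffs[0] = [c + d for c, d in zip(coeffs[0], w)]
marked = inverse_temporal_haar(coeffs)
```

Under this normalization, a watermark added to the DC wavelet frame appears unchanged in every reconstructed frame of the scene, which is the "exists in all frames" property described above.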

In step 28, a watermark is embedded in each wavelet
frame. The watermark is designed and embedded in the
wavelet domain, such that the individual watermarks for
each wavelet frame are spread out to varying levels of
support in the temporal domain. For example, watermarks
embedded in highpass wavelet frames are localized temporally.
Conversely, watermarks embedded in lowpass wavelet
frames are generally located throughout the scene in the
temporal domain. The watermarks are embedded in accordance
with perceptual and non-frequency masks, as has been
described. That is, the watermarks are embedded in each
frame of each scene in accordance with perceptual and
spatial (non-frequency) characteristics of the frame, as has
been described in conjunction with the method of FIG. 1.

The scene-based video watermarking method of the
invention has several other advantages. It is scene-based and
video dependent, and directly exploits spatial masking,
frequency masking, and temporal properties such that the
embedded watermark is invisible and robust. The watermark
consists of static and dynamic temporal components that are
generated from a temporal wavelet transform of the video
scenes. The resulting wavelet frames are modified by a
perceptually shaped pseudo-random sequence representing
the author (owner). The noise-like watermark is statistically
undetectable to thwart unauthorized removal. Furthermore,
the author representation resolves the deadlock problem.
The multiresolution watermark may be detected on single
frames without knowledge of the location of the frames in
the video scene.

Because the video watermarking procedure is perception-based,
the watermark adapts to each individual video signal.
In particular, the temporal and frequency distributions of the
watermark are controlled by the masking characteristics of
the host video signal. As a result, the strength of the
watermark increases and decreases with host, e.g., higher
amplitude in regions of the video with more textures, edges,
and motion. This ensures that the embedded watermark is
invisible while having the maximum possible robustness.

Because the watermark representation is scene-based and
multiscale, given one or more frames from a potentially
pirated video, the watermark may be extracted from the
frames without knowledge of the location of the frame being
tested. This detection characteristic exists due to the combined
static and dynamic representation of the watermark.

The watermark representation of the invention provides
an author representation that solves the deadlock problem.
The author or owner of the video is represented with a
pseudo-random sequence created by a pseudo-random generator
and two keys. One key is author dependent, while the
second key is signal dependent. The representation is able to
resolve rightful ownership in the face of multiple ownership
claims.

The watermark representation of the invention also provides
a dual watermark. The watermarking scheme uses the
original video signal to detect the presence of a watermark.
The procedure can handle virtually all types of distortions,
including cropping, temporal rescaling, frame dropping,
etc., using a generalized likelihood ratio test. This procedure
is integrated with a second watermark which does not
require the original signal to address the deadlock problem.

As has been described, steps 24, 26, and 28 of FIG. 2
provide an overview of the scene-based watermarking process
of the present invention. The specifics of the hardware
implementation of the invention are now provided.

Hardware Implementation of the Invention

The present invention is not limited as to the type of
computer on which it runs. However, a typical example of
such a computer is shown in FIG. 3. Computer 16 is a
desktop computer, and may be of any type, including a
PC-compatible computer, an Apple Macintosh computer, a
UNIX-compatible computer, etc. Computer 16 usually
includes keyboard 18, display device 20 and pointing device
22. Display device 20 can be any of a number of different
devices, including a cathode-ray tube (CRT), etc. Pointing
device 22 as shown in FIG. 3 is a mouse, but the invention
is not so limited. Not shown is that computer 16 typically
also comprises a random-access memory (RAM), a read-only
memory (ROM), a central-processing unit (CPU), a
fixed storage device such as a hard disk drive, and a
removable storage device such as a floppy disk drive. The
computer program to implement the present invention is
typically written in a language such as C, although the
present invention is not so limited.

The specifics of the hardware implementation of the
invention have been described. A particular implementation
of the scene-based video watermarking of the invention,
based on the methods of FIG. 1 and FIG. 2, is now described.

Particular Implementation of Scene-Based Video
Watermarking

The embodiment shown in FIG. 4 illustrates a particular
implementation of scene-based video watermarking according
to the invention, as based on the methods of FIG. 1 and
FIG. 2 that have already been described. Referring now to
FIG. 4, a block diagram of this specific implementation of
scene-based video watermarking is shown. Video frames 32
(of video host data) are denoted such that Fi is the ith frame
in a video scene, where i=0, . . . , k-1. Frames are ordered
sequentially according to time. Each frame is of size nxm.
The video itself may be gray scale (8 bits/pixel) or color (24
bits/pixel). Frames 32 undergo a temporal wavelet transformation
34, as has been described, to become wavelet frames
35. The tilde representation is used to denote a wavelet
representation. For example, F~i is the ith wavelet coefficient
frame. Without loss of generality, wavelet frames are
ordered from lowest frequency to highest frequency, i.e.,
F~0 is a DC frame. Thus, there are k wavelet coefficient
frames F~i, i=0, . . . , k-1.

In step 38, each wavelet frame F~i is segmented into 8x8
blocks B~ij, i=0, 1, . . . , (n/8) and j=0, 1, . . . , (m/8). In step
40, each block B~ij is subjected to a discrete cosine transform
(DCT), to become block B~ij'. In step 42, a perceptual
frequency mask, as has been described, is applied to each
block to obtain the frequency mask M'ij. In step 44, author
signature Yij (the watermark) also undergoes a discrete
cosine transform to become Y'ij. It should be noted that the
generation of author signature Yij is desirably in accordance
with the process that has been described in conjunction with
step 10 of FIG. 1, but the invention is not so limited.

In step 46, the mask M'ij is used to weight the noise-like
author signature Y'ij for that frame block, creating the
frequency-shaped author signature P'ij=M'ij Y'ij. In step 48,
the spatial mask S~ij is generated, as has been described, and
in step 50, the wavelet coefficient watermark block W~ij is
obtained by computing the inverse DCT of P'ij in step 52 and
locally increasing the watermark to the maximum tolerable
error level provided by the spatial mask S~ij. Finally, in
step 54, the watermark W~ij is added to the block B~ij,
creating the watermarked block. The process is repeated for
each wavelet coefficient frame F~i.

The watermark for each wavelet coefficient frame is the
block concatenation of all the watermark blocks for that
frame. The wavelet coefficient frames with the embedded
watermarks are then converted back to the temporal domain
using the inverse wavelet transform. As the watermark is
designed and embedded in the wavelet domain, the individual
watermarks for each wavelet coefficient frame are
spread out to varying levels of support in the temporal
domain. For example, watermarks embedded in highpass
wavelet frames are localized temporally. Conversely, watermarks
embedded in lowpass wavelet frames are generally
located throughout the scene in the temporal domain.
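
The per-block embedding of steps 40 through 54 can be sketched for a single 8x8 block as follows. This is an illustrative outline under stated assumptions, not the implementation of FIG. 4: the two mask functions are hypothetical placeholders for the frequency and spatial masking models, and the final scaling step is one plausible reading of "locally increasing the watermark to the maximum tolerable error level."

```python
import math
import random

N = 8
# Orthonormal DCT-II basis matrix for 8-point transforms.
C = [[math.sqrt((1 if u == 0 else 2) / N)
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    return matmul(matmul(transpose(C), coeffs), C)


def frequency_mask(coeffs, floor=0.5, gain=0.1):
    # Placeholder (assumption): larger host coefficients tolerate more change.
    return [[floor + gain * abs(c) for c in row] for row in coeffs]


def spatial_mask(block, base=2.0):
    # Placeholder (assumption): a constant tolerable error for every pixel.
    return [[base] * N for _ in range(N)]


def embed_block(block, signature):
    host_dct = dct2(block)                          # step 40
    mask = frequency_mask(host_dct)                 # step 42
    sig_dct = dct2(signature)                       # step 44
    shaped = [[m * y for m, y in zip(mr, yr)]       # step 46: P' = M' Y'
              for mr, yr in zip(mask, sig_dct)]
    w = idct2(shaped)                               # step 52
    limit = spatial_mask(block)                     # step 48
    scale = min((s / abs(v) for sr, wr in zip(limit, w)
                 for s, v in zip(sr, wr) if v != 0), default=0.0)
    w = [[scale * v for v in row] for row in w]     # step 50: push to the limit
    marked = [[b + v for b, v in zip(br, wr)]       # step 54
              for br, wr in zip(block, w)]
    return marked, w


rng = random.Random(7)
block = [[128.0 + 10 * ((x * y) % 5) for y in range(N)] for x in range(N)]
signature = [[rng.choice((-1.0, 1.0)) for _ in range(N)] for _ in range(N)]
marked, w = embed_block(block, signature)
```

In a full pipeline this routine would run over every 8x8 block of every wavelet coefficient frame, with a different signature per block, before the inverse temporal wavelet transform is applied.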

The watermarks embedded within the video data according
to the method of FIG. 4 should be extractable even if
common signal processing operations are applied to the host
data. This is particularly true in the case of deliberate
unauthorized attempts to remove the watermark. For
example, a pirate may attempt to add noise, filter, code,
re-scale, etc., the host data in an attempt to destroy the
watermark. The embedded watermark, however, is noise-like
and its location over multiple blocks of the host data,
over successive frames of the data, is unknown. Therefore,
the pirate has insufficient knowledge to directly remove the
watermark. Furthermore, a different signature is used for
each block to further reduce unauthorized watermark
removal by cross correlation. Any destruction attempts are
done blindly.

Detection of the watermark is accomplished via a generalized
likelihood ratio test. Two methods have been developed
to extract the potential watermark from a test video or test
video frame. Both employ hypothesis testing. One test
employs index knowledge during detection, i.e., the placement
of the test video frame(s) relative to the original video
is known. The second detection method does not require
knowledge of the location of the test frame(s). This is
extremely useful in a video setting, where 1000's of frames
may be similar, and it is uncertain where the test frames
reside.

In the first method, watermark detection with index
knowledge, when the location of the test frame is known, a
straightforward hypothesis test may be applied. For each frame
in the test video Rk, a hypothesis test is performed.

H0: Xk=Rk-Fk=Nk (no watermark)

H1: Xk=Rk-Fk=W*k+Nk (watermark)

where Fk is the original frame, W*k is the (potentially
modified) watermark recovered from the frame, and Nk is
noise. The hypothesis decision is obtained by computing the
scalar similarity between each extracted signal and original
watermark Wk: Sk=Simk(Xk, Wk)=(Xk*Wk)/(Wk*Wk).
The overall similarity between the extracted and original
watermark is computed as the mean of Sk for all k: S=mean(Sk).
The overall similarity is compared with a threshold to
determine whether the test video is watermarked. The
experimental threshold is desirably chosen around 0.1, i.e.,
a similarity value >=0.1 indicates the presence of the owner's
copyright. In such a case, the video is deemed the
property of the author, and a copyright claim is valid. A
similarity value <0.1 indicates the absence of a watermark.
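
The similarity computation for detection with index knowledge can be sketched as below. Frames and watermarks are represented as flat lists of numbers, and the function names are illustrative. The 0.1 threshold is the experimental value quoted above.

```python
def similarity(extracted, watermark):
    """Sim(X, W) = (X . W) / (W . W), using ordinary dot products."""
    num = sum(x * w for x, w in zip(extracted, watermark))
    den = sum(w * w for w in watermark)
    return num / den


def detect_with_index(test_frames, original_frames, watermarks, threshold=0.1):
    """Return (overall similarity S, decision) for index-aligned frames."""
    scores = []
    for r, f, w in zip(test_frames, original_frames, watermarks):
        x = [a - b for a, b in zip(r, f)]       # X_k = R_k - F_k
        scores.append(similarity(x, w))         # S_k
    overall = sum(scores) / len(scores)         # S = mean(S_k)
    return overall, overall >= threshold


originals = [[10.0, 20.0, 30.0, 40.0], [50.0, 60.0, 70.0, 80.0]]
marks = [[1.0, -1.0, 1.0, -1.0], [-1.0, 1.0, 1.0, -1.0]]
marked = [[p + w for p, w in zip(f, m)] for f, m in zip(originals, marks)]

s_marked, found = detect_with_index(marked, originals, marks)
s_clean, found_clean = detect_with_index(originals, originals, marks)
```

An undistorted watermarked video yields a similarity of 1, an unwatermarked copy yields 0, and distortions move the score between those extremes.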

When the length (in terms of frames) of the test video is
the same as the length of the original video, the hypothesis
test is performed in the wavelet domain. A temporal wavelet
transform of the test video is computed to obtain its wavelet
coefficient frames R~k. Thus,

H0: X~k=R~k-F~k=Nk (no watermark)

H1: X~k=R~k-F~k=W~*k+Nk (watermark)

where F~k are the wavelet coefficient frames from the
original video, W~*k is the potentially modified watermark
from each frame, and Nk is noise. This test is performed for
each wavelet frame to obtain X~k for all k. Similarity values
are computed as before, Sk=Simk(X~k,W~k).

Using the original video signal to detect the presence of
a watermark, virtually all types of distortions can be
handled, including cropping, rotation, rescaling, etc., by
employing a generalized likelihood ratio test. A second
detection scheme which is capable of recovering a watermark
after many distortions without a generalized likelihood
ratio test has also been developed. The procedure is fast and
simple, particularly when confronted with the large amount
of data associated with video.

In the method for watermark detection without index 
knowledge, there is no knowledge of the indices of the test 
frames. Pirate tampering may lead to many types of derived 
videos which are often difficult to process. For example, a
pirate may steal one frame from a video. A pirate may also 
create a video which is not the same length as the original 
video. Temporal cropping, frame dropping, and frame inter- 
polation are all examples. A pirate may also swap the order 
of the frames. Most of the better watermarking schemes 
currently available use different watermarks for different 
images. As such, they generally require knowledge of which 
frame was stolen. If they are unable to ascertain which frame 
was stolen, they are unable to determine which watermark 
was used. 

This method can extract the watermark without knowl- 
edge of where a frame belongs in the video sequence. No 
information regarding cropping, frame order, interpolated 
frames, etc., is required. As a result, no searching and 
correlation computations are required to locate the test frame 
index. The hypothesis test is formed by removing the low 
temporal wavelet frame from the test frame and computing 
the similarity with the watermark for the low temporal 
wavelet frame. The hypothesis test is formed as 

H0: Xk=Rk-F~0=Nk (no watermark)

H1: Xk=Rk-F~0=W~*k+Nk (watermark)

where Rk is the test frame in the spatial domain and F~0 is
the lowest temporal wavelet frame. The hypothesis decision
is made by computing the scalar similarity between each
extracted signal Xk and original watermark for the low
temporal wavelet frame W~0: Simk(Xk, W~0). This simple
yet powerful approach exploits the wavelet property of 
varying temporal support. 
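
Detection without index knowledge can be illustrated with a synthetic scene. The sketch assumes an averaging normalization for the lowpass temporal wavelet frame (so that a watermark placed in it appears in every frame), a nearly static scene, and a seeded random generator for the test data. All of these are choices made for the example rather than details from the text.

```python
import random


def similarity(x, w):
    """Sim(X, W) = (X . W) / (W . W)."""
    return sum(a * b for a, b in zip(x, w)) / sum(b * b for b in w)


def detect_without_index(test_frame, low_frame, low_watermark, threshold=0.1):
    """X_k = R_k - F~0, then compare Sim(X_k, W~0) with the threshold."""
    x = [r - f for r, f in zip(test_frame, low_frame)]
    score = similarity(x, low_watermark)
    return score, score >= threshold


rng = random.Random(0)
n, k = 4096, 8                                   # pixels per frame, frames per scene
background = [rng.uniform(0, 255) for _ in range(n)]
scene = [[b + rng.gauss(0, 1) for b in background] for _ in range(k)]

# Lowpass temporal frame of the original scene (assumed averaging normalization).
low_frame = [sum(col) / k for col in zip(*scene)]

# Watermark for the lowpass frame; it shows up in every frame of the scene.
w0 = [rng.choice((-1.0, 1.0)) for _ in range(n)]
marked = [[p + w for p, w in zip(frame, w0)] for frame in scene]

# Test a single stolen frame without using its position in the scene.
score_marked, found = detect_without_index(marked[5], low_frame, w0)
score_clean, found_clean = detect_without_index(scene[5], low_frame, w0)
```

The difference between a frame and the lowpass frame acts as the noise term Nk here, so the method works best when frames within a scene are visually similar, as the scene segmentation is meant to ensure.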

Although specific embodiments have been illustrated and 
described herein, it will be appreciated by those of ordinary 
skill in the art that any arrangement which is calculated to 
achieve the same purpose may be substituted for the specific 
embodiments shown. This application is intended to cover 
any adaptations or variations of the present invention. 
Therefore, it is manifestly intended that this invention be 
limited only by the following claims and equivalents 
thereof. 

I claim: 

1. A computerized method for embedding data represent- 
ing a watermark into host data relating to video: 

generating the data representing the watermark;

subjecting the host data to a temporal wavelet transform; 

embedding the data into the host data, as subjected to the 
temporal wavelet transform, in accordance with a per- 
ceptual mask conducted in the frequency domain; and, 

subjecting the host data, including the data embedded 
therein, to a non-frequency mask. 

2. The computerized method of claim 1, wherein the data 
representing the watermark comprises a pseudo-random 
sequence. 

3. The computerized method of claim 1, wherein gener- 
ating the data representing the watermark uses a pseudo- 
random generator and two keys to generate the data. 

4. The computerized method of claim 3, wherein the 
pseudo-random generator is selected from the group comprising
RSA, Rabin, Blum/Micali, and Blum/Blum/Shub.

5. The computerized method of claim 1, wherein the 
perceptual mask comprises a model in which a contrast
threshold at a frequency f is expressed as a function of the
frequency f, a masking frequency fm and a masking contrast
cm,

$$c(f, f_m, c_m) = c_0(f)\,\max\{1, [k(f/f_m)\, c_m]^{\alpha}\}$$

where c0(f) is a detection threshold at the frequency f.

6. The computerized method of claim 1, wherein the
non-frequency mask comprises a spatial mask.

7. The computerized method of claim 1, wherein subject- 
ing the host data to a temporal wavelet transform results in 
a multiresolution temporal representation of the video hav- 
ing temporal lowpass frames and temporal highpass frames. 

8. A scene-based computerized method of watermarking 
host data relating to video comprising: 

segmenting the host data into a plurality of scenes, each 
scene having a plurality of frames; 

subjecting each frame of each scene to a temporal wavelet 
transform; and, 

embedding each frame of each scene, as has been sub- 
jected to the temporal wavelet transform, with a water- 
mark in accordance with perceptual and spatial char- 
acteristics of the frame. 

9. The scene-based computerized method of claim 8, 
wherein subjecting each frame of each scene to the temporal 
wavelet transform results in lowpass wavelet frames and 
highpass wavelet frames. 

10. The scene-based computerized method of claim 9, 
wherein watermarks embedded in lowpass wavelet frames 
are located throughout the scene in a temporal domain. 

11. The scene-based computerized method of claim 9, 
wherein watermarks embedded in highpass wavelet frames 
are localized temporally. 

12. A computerized system for watermarking host data 
relating to video and having a plurality of scenes, each scene 
having a plurality of frames, comprising: 

a processor; 

a computer-readable medium; 

computer-executable instructions executed by the proces- 
sor from the computer-readable medium comprising: 
applying a temporal wavelet transform to each frame;
segmenting each frame of each scene into blocks;
applying a discrete cosine transform (DCT) to each
block to generate a frequency block corresponding to
the block;

generating a perceptual mask for each frequency block; 
applying the DCT to a watermark for each frequency 
block; 

weighting the perceptual mask for each frequency 
block with the watermark for the frequency block to 
which the DCT has been applied to generate a 
frequency-shaped author block; 

applying an inverse DCT to each frequency-shaped
author block to generate a time-domain block; 

generating a spatial mask for each block; 

weighting each time-domain block by a spatial mask to 
generate a watermark block; and, 

adding each block to a corresponding watermark block 
to generate a watermarked block. 

13. A computer-readable medium having a computer 
program stored thereon to cause a suitably equipped com-
puter to perform a method comprising: 

applying a temporal wavelet transform to each frame; 
segmenting each frame of each scene into blocks;
applying a discrete cosine transform (DCT) to each block
to generate a frequency block corresponding to the
block;

generating a perceptual mask for each frequency block; 
applying the DCT to a watermark for each frequency 
block; 

weighting the perceptual mask for each frequency block 
with the watermark for the frequency block to which 
the DCT has been applied to generate a frequency- 
shaped author block; 

applying an inverse DCT to each frequency-shaped author 
block to generate a time-domain block; 

generating a spatial mask for each block; 

weighting each time-domain block by a spatial mask to 
generate a watermark block; and, 

adding each block to a corresponding watermark block to 
generate a watermarked block. 

14. The computer-readable medium of claim 13, wherein
the computer-readable medium is a floppy disk.